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Abstract 

Unique  database  requirements  in  the  realm  of  non-standard  applications  (such  as 
computer-aided  design  (CAD),  computer-aided  software  engineering  (CASE),  and  office 
information  systems  (OIS))  have  driven  the  development  of  new  data  models  and  database 
systems  based  on  these  new  models.  In  particular,  the  goal  of  these  new  database  systems 
is  to  exploit  the  advantages  of  complex  data  models  that  are  more  efficient  (in  terms  of 
time  and  space)  than  their  relational  counterparts. 

This  research  effort  describes  the  design  and  implementation  of  the  Triton  nested 
relational  database  system,  a  prototype  system  based  on  the  nested  relational  data  model. 
Triton  is  intended  to  be  used  as  the  backed  storage  component  for  some  non-standard 
application.  To  quickly  prototype  the  system,  the  EXODUS  extensible  database  system  is 
used  in  the  development  of  Triton. 

The  research  presented  in  this  document  focuses  on  Triton’s  operator  and  access 
methods,  and  compares  the  performance  of  the  nested  relational  model  versus  the  relational 
model  using  these  methods.  In  addition,  the  effectiveness  of  the  EXODUS  extensible 
database  toolkit  is  evaluated. 
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ACCESS  AND  OPERATOR  METHODS  FOR  THE 
TRITON  NESTED  RELATIONAL  DATABASE  SYSTEM 


I.  Introduction 


1.1  Overview 

Information  storage  and  retrieval  plays  an  increasingly  important  role  in  our  lives. 
Banking  transactions,  military  command  and  control,  and  hospital  record  keeping  are  but 
a  few  examples  of  our  dependence  upon  the  prompt  processing  and  efficient  management 
of  large  volumes  of  data. 

Since  the  early  1960’s,  computerized  databases  have  proven  to  be  an  indispensible 
way  to  manage  information  (19).  A  database  management  system  (DBMS)  consists  of  two 
parts:  a  set  of  computer  data  and  programs  that  provide  access  to  that  data  (18:1).  A 
DBMS  is  different  from  a  file  system  because  in  a  file  system  only  small  portions  of  data 
can  be  examined  or  updated  at  a  time;  file  systems  do  not  allow  the  efficient  correlation 
of  data  in  separate  files.  DBMSs  provide  this. correlation  in  a  fast  and  easy  fashion. 

In  recent  years,  database  research  has  focused  oh  the  development  of  database 
systems  to  support  non-standard  applications,  such  as  computer-aided  design  (CAD), 
computer-aided  software  engineering  (CASE),  and  office  information  systems  (OIS).  Re¬ 
quirements  in  these  new  application  areas  have  driven  the  development  of  new  data  models 
and  database  systems  based  on  these  new  models  to  efficiently  manage  large  volumes  of 
non-standard,  or  complex,  data  (such  as  textural  or  pictoral  information). 

The  Triton  nested  relational  database  system  is  a  prototype  system  based  on  an 
extension  of  the  relational  model  (8),  called  the  nested  relational  model  (32),  which  allows 
a  hierarchical  representation  of  complex  objects.  The  primary  goal  of  lliis  research  effort 
is  to  develop  and  implement  operator  and  access  methods  for  the  Triton  system  that 
efficiently  exploit  this  hierarchical  representation  of  data.  The  intended  use  of  the  Triton 
system  is  as  the  backend  for  applications  which  require  database  support  not  provided  by 
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current  relational  databases,  such  as  CASE  or  CAD  tools.  Thus,  queries  to  the  system 
will  be  made  via  embedded  calls  within  the  application  program,  and,  as  such,  Triton  will 
be  dealing  with  known  queries  as  opposed  to  ad  hoc  queries. 

The  development  of  a  DBMS  is  a  mamoth  task  in  terms  of  time  and  the  amount 
of  programming  required.  In  order  to  quickly  prototype  the  Triton  system  we  utilized 
EXODUS  (7),  an  extensible  database  “toolkit”  developed  at  the  University  of  Wisconsin. 
EXODUS  provides  several  necessary  facilities  and  tools  that  aid  in  the  development  of 
Triton’s  access  methods,  which  include  a  storage  manager  and  a  persistent  programming 
language,  called  E.  Not  only  does  this  research  effort  provide  Triton  the  ability  to  process 
queries  of  any  type  (except  for  data  restructuring  queries),  it  also  provides  the  means  to 
evaluate  the  advantages  and  disadvantages  of  using  extensible  systems  such  as  EXODUS 
and  a  persistent  programming  language  to  build  application-specific  database  management 
systems. 

The  purpose  of  this  chapter  is  to  define  the  scope  and  purpose  of  this  research  effort. 
The  relational  model  is  summarized,  followed  by  an  introduction  of  the  nested  relational 
model.  The  Triton  nested  database  system  is  then  described,  whicu  leads  into  the  purpose 
and  scope  of  my  research. 

1.2  Background 

1.2.1  Relational  Database  Model.  In  1970,  Codd  proposed  the  formal  relational 
database  model  (8).  In  this  model,  information  is  stored  in  a  tabular  format  called  a 
relation  which  is  composed  of  attributes,  or  fields,  and  records,  or  tuples.  A  column  in 
a  table  represents  an  attribute  and  a  row  is  a  single  record.  As  an  example,  Figure  1.1 
shows  a  simple  relation  depicting  information  on  employees  of  a  company.  In  the  relational 
model,  all  attributes  must  be  atomic,  or  undecomposable.  In  other  words,  an  attribute 
can  only  contain  one  integer,  one  real  number,  or  a  single  character  string.  “A  relation 
which  only  has  atomic  valued  attributes  is  said  to  be  in  first  normal  form  (INF).”  (23:7) 

1.2.2  The  Nested  Relational  Model.  The  nested  relational  model  (NRM)  is  an  ex¬ 
tension  of  the  relational  model  that  relaxes  the  INF  restriction  and  allows  attributes  of  a 
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dept 

empjname 

emp.age 

empssn 

Marketing 

John  Smith 

27 

237-46-3567 

Research 

Michael  Taylor 

31 

395-73-8901 

Advertising 

Tina  Therrien 

43 

555-12-3434 

Personnel 

Ca,rla  Dunlap 

37 

624-35-8152 

Figure  1.1.  The  Flat  Employee  Relation  Without  Children 


relation  to  be  a  set  of  values  (20),  or  possibly,  another  relation  (32).  The  NRM  has  two 
advantages  over  the  relational  model  for  the  storage  of  hierarchical  data:  decreased  storage 
requirement  and  increased  processing  speed.  These  advantages  are  best  illustrated  by  an 
example.  Assume  we  wish  to  store  information  on  the  employees  of  a  company  and  their 
children.  If  a  relational  database  is  used  to  store  this  information,  it  requires  the  relational 
tables  shown  in  Figures  1.1  and  1.2.  Notice  that  the  empssn  attribute  is  duplicated  in 
both  tables  so  that  the  correspondence  between  employees  and  their  children  is  not  lost. 
This  attribute  duplication  clearly  requires  additional  data  storage.  However,  if  a  nested 
relational  database  is  used  to  store  this  information,  only  a  single  table  is  required  as 
depicted  in  Figure  1.3.  Because  children  are  nested  directly  under  employees  in  the  same 
table,  the  empssn  field  does  not  have  to  be  duplicated  to  correlate  employee  data  and 
children  data.  In  addition,  a  query  on  the  relational  database  involving  both  employee 
data  and  children  data  would  require  the  two  tables  to  be  joined,  which  takes  time.1  Since 
all  data  is  kept  in  one  table  in  the  nested  relational  model,  a  costly  join  does  not  have 
to  take  place.  An  intangible  advantage  of  the  NRM  over  the  relational  model  is  a  more 
intuitive  mapping  of  complex  data;  the  relational  model  splits  the  data  into  INF  “chunks”, 
while  the  NRM  allows  data  to  remain  in  hierarchical  form  within  the  database  schema. 

However,  the  nested  relational  model  does  have  some  disadvantages.  The  schema 
of  relations  is  more  complex  due  to  relation-valued  attributes.  In  addition,  because  data 
is  not  normalized,  updates  to  a  nested  relation  may  require  the  modification  of  several 
tuples  as  opposed  to  the  modification  of  one  tuple  in  the  relational  version.  Finally,  the 
nested  relational  model  does  not  allow  the  sharing  of  relation-valued  attributes  between 

1  Such  a  query  might  be,  “What  are  the  names  of  Michael  Taylor’s  children?” 
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empssn 

child-name 

child.age 

sex 
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8 
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Todd 

4 

M 
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Susan 

3 

F 

555-12-3434 

Laura 

18 

F 

555-12-3434 

John 

13 

M 

555-12-3434 

Matthew 

11 

M 

624-35-8152 

George 

5 

M 

624-35-8152 

Janis 

3 

MM 

Figure  1.2.  The  Children  Relation 
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Figure  1.3.  The  Nested  Employee  Relation 


tuples  of  nested  relations.  However,  for  the  Triton  system,  we  feel  the  advantages  of  the 
nested  relational  model  outweigh  its  disadvantages,  particularly  since  the  nested  model 
outperforms  the  relational  model  for  complex  data  in  terms  of  query  execution  time  (as 
shown  in  Chapter  4). 

1.3  Purpose  of  Thesis 

The  goal  of  this  research  is  to  provide  Triton  the  ability  to  process  queries  of  any 
type  (except  for  data  restructuring  queries)  and  to  lay  the  foundation  for  future  work  in 
this  area.  In  particular,  this  research  centers  on  the  design  and  implementation  of  access 
and  operator  methods  that  take  advantage  of  the  hierarchical  nature  of  the  nested  data. 
Because  the  EXODUS  extensible  database  system  isused  in  the  development  of  the  Triton 
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QUERY 


RESPONSE 


Figure  1.4.  The  Triton  Nested  Database  System  Architecture 

system,  the  operator  and  access  methods  are  developed  within  the  constraints  dictated 
by  EXODUS.  The  two  areas  where  these  constraints  are  felt  most  strongly  are  in  the 
representation  of  the  nested  relational  model  using  the  persistent  programming  language 
constructs  of  the  E  programming  language  provided  by  EXODUS  (25),  and  access  to  these 
persistent  structures  via  the  EXODUS  storage  manager  (5).  These  issues  are  discussed  in 
more  detail  in  Chapter  4. 

1.3.1  The  Triton  Nested  Relational  Database  System  Architecture.  The  scope  of 
this  research  effort  is  best  explained  within  the  context  of  the  operation  of  the  Triton 
system.  Query  processing  in  this  system,  depicted  in  Figure  1.4,  is  explained  below. 

The  user  (or  application  program)  inputs- a  query  that  is  written  in  a  special  query 
language  for  nested  relations,  called  SQL/NF  (26).  By  accessing  the  database  schema 
via  the  catalog  manager,  the  parser  translates  the  query  into  an  unambiguous  algebra  for 
computer  manipulation.  This  algebraic  representation  is  contained  in  a  data  structure 
called  a  query  tree ,  which  consists  of  a  series  of  nodes  and  pointers  to  other  nodes.  Each 
query  node  contains  a  relational  algebra  operator  or  an  access  method,  the  relation(s)  the 
method  refers  to,  and  any  oolean  conditions  to  apply. 
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The  query  tree  is  passed  to  the  rule-based  optimizer  where  it  is  changed  to  a  plan  tree. 
In  the  optimizer,  the  sequence  and  contents  of  the  nodes  in  the  query  tree  are  changed 
into  an  optimized  plan  tree  and  relational  operators  are  replaced  with  specific  operator 
methods. 

The  E  code  generator  traverses  the  plan  tree  and  generates  the  E  code  to  effect  the 
operator  and  access  methods  to  perform  the  query.  The  E  compiler  links  the  methods  with 
the  E  code  that  specifies  the  relation  definitions.  The  resulting  object  code  is  executed  to 
manipulate  the  information  in  the  database  via  the  storage  manager.  The  manipulated 
data  is  returned  to  the  user  as  the  response  to  the  query. 

1.S.2  Algebra  Used  in  the  Triton  System.  The  implementation  of  the  operator  and 
access  methods  relies  on  a  structural  representation  of  the  nested  relational  algebra  de¬ 
veloped  by  Latha  Colby  (9).  Colby  defined  a  recursive  algebra  for  nested  relations  that 
builds  on  the  traditional  relational  algebra  operators.  Her  operators  support  the  retrieval 
of  information  from  any  level  of  nesting  in  a  relation  without  first  “flattening  out”  the 
relation  to  fit  the  relational  model  (9:276). 

Colby’s  algebra  uses  an  extensive  group  of  set  operators  and  redefines  the  select  (a), 
project  (tt),  and  join  (M)  operators  of  the  relational  model  for  use  on  nested  objects.  In 
addition,  she  defined  two  new  operators,  nest  ( u )  and  unnest  (//).  The  nest  operator 
restructures  the  database  into  a  nested  form,  while  the  unnest  operator  “flattens  out”  the 
nested  relation  into  a  relational  form.  Appendix  A  gives  a  brief  summary  of  each  of  Colby’s 
relational  operators  and  demonstrates  their  use  on  example  nested  relations. 

1.4  Scope  of  this  Research/Objective 

The  work  accomplished  in  this  research  effort  is  summarized  as  follows: 

1.  Design  and  implementation  of  the  operator  methods  of  the  Colby  algebra  to  handle 
multiple  levels  of  nesting,  including; 

•  The  filescan  method  that  implements  a  project  and/or  a  select 

•  The  loops.join  method  that  implements  a  join  as  well  as  a  project  and/or  select 


2.  Design  and  implementation  of  the  access  methods  to  modify  data  in  the  database, 
including: 

•  The  store-values  method  that  adds  data  to  a  relation  at  any  level  or  levels  of 
nesting 

•  The  modify  method  that  modifies  data  in  a  relation  at  any  level  or  levels  of 
nesting 

•  The  delete  method  that  deletes  data  from  a  relation  at  any  level  or  levels  of 
nesting 

•  The  creafe.re/  method  that  adds  a  new  relation  to  the  database 

•  The  drop.rel  method  that  deletes  a  relation  from  the  database 

3.  Design  and  implementation  of  the  code  generator,  codegen ,  to  implement  the  operator 
and  access  methods 

4.  Testing  of  methods  and  code  generator 

5.  Comparison  of  the  performance  of  the  nested  relational  model  versus  its  normalized 
(INF)  version  using  the  operator  methods  and  code  generator: 

•  Development  of  an  E  representation  of  a  sample  set  of  relational  IDEFo  language 
datai(23) 

•  Development  of  an  E  representation  of  nested  relational  IDEFo  language  data  (23) 

•  Design  of  programs  to  load  data  into  both  E  representations 

•  Creation  of  queries  for  both  representations  to  evaluate  performance  of  the  NRM 
against  the  traditional  relational  model 

•  Comparison  of  the  NRM  against  the  relational  model  based  on  code  generation 
time  and  query  execution  time 

6.  Discussion  of  the  advantages  and  disadvantages  of  using  the  EXODUS  toolkit  in  this 
research  effort 
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1.5  Methodology /Approach 

EXODUS  provides  several  facilities  and  tools  for  the  construction  of  application- 
specific  DBMSs.  One  such  tool  is  the  E  programming  language  and  its  compiler  (25,  24) 
which  is  an  extension  of  C++  (31)  and  builds  on  the  object-oriented  nature  of  C++. 
By  exploiting  the  powerful  object-oriented  capabilities  of  the  E  and  C++  programming 
languages,  the  access  and  operator  methods  (which  are  described  in  detail  in  Chapter  3) 
are  implemented  as  iterator  functions  (25)  using  a  recursive  procedure;  whenever  a  nested 
attribute  is  encountered  in  the  plan  tree,  the  procedure  is  recursively  called.  The  use  of 
recursion  greatly  simplifies  the  design  and  permits  any  level  of  nesting  in  the  query. 

The  E  code  generator  implements  the  access  and  operator  methods  by  setting  up 
an  iterate  loop  that  calls  the  operator  methods.  The  operator  and  access  methods  are 
tested  using  sample  queries  on  various  database  schemas  to  ensure  the  prototype  database 
performs  correctly. 

The  Triton  system  is  used  to  store  data  on  Structured  Analysis  (SA)  diagrams  (22) 
using  the  NRM  representation  as  well  as  a  relational  representation  of  the  same  data.  The 
schemas  used  are  based  on  the  work  of  Captain  Gerald  Morris  (23).  The  performance  of 
the  NRM  representation  r  compared  to  the  relational  representation;  in  particular,  the 
advantage  of  the  NRM  over  the  relational  model  is  demonstrated  with  respect  to  code 
generation  time  and- query  execution  time. 

All  code  produced  in  this  research  effort  is  documented  in  accordance  with  Air  Force 
Institute  of  Technologpsystem  development  documentation  guidelines  and  standards  (15). 

1.6  Materials  and  Equipment 

This  research  effort  utilized  the  EXODUS  facilities  and  tools  on  a  Sun  3  workstation. 
The  developers  of  EXODUS  at  the  University  of  Wisconsin  released  EXODUS  software 
modifications  which  were  implemented  at  the  Air  Force  Institute  of  Technology. 
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1.7  Outline  of  this  Document 

\ 

Chapter  2  describes  the  historical  development  of  the  NRM  and  presents  significant 
work  in  NRM  theory  related  to  this  research  effort.  This  chapter  also  presents  the  EXODUS 
extensible  database  system.  Chapter  3  begins  with  a  discussion  of  the  E  programming 
language  representation  of  nested  relations.  The  chapter  continues  with  a  description  of 
Triton’s  operator  and  access  methods,  including  a  presentation  of  the  E  code  generator 
that  implements  these  methods.  Chapter  4  evaluates  the  performance  of  the  NRM  versus 
the  relational  model  using  the  access  and  operator  methods  and  discusses  the  advantages 
and  disadvantages  of  using  the  EXODUS  toolkit  in  the  development  of  the  Triton  system. 
Chapter  5  provides  a  summary  of  this  work  and  suggests  areas  for  further  research. 
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II.  Overview  of  the  Nested  Relational  Model  and  the  EXODUS  Extensible  Database 

System 


2.1  Overview 

This  chapter  provides  an  overview  of  key  research  in  the  area  of  nested  relational 
database  systems  as  related  to  the  development  of  the  Triton  nested  relational  database 
system.  In  the  sections  that  follow,  the  motivation  for  the  development  of  the  nested 
relational  model  is  discussed,  followed  by  a  brief  summary  of  its  historical  development. 
Notable  advances  in  nested  relational  database  theory  are  also  presented,  including  a  brief 
survey  of  nested  relational  database  implementations.  Finally,  the  chapter  concludes  with 
a  description  of  the  EXODUS  extensible  database  system,  and  how  EXODUS  is  used  in 
the  development  of  Triton. 

2.2  Impetus  for  the  Development  of  the  Nested  Relational  Model 

The  development  of  database  management  systems  has  been  driven  by  the  changing 
needs  of  their  users.  The  requirement  for  fast  access  to  reliable  data  by  multiple  users 
sparked  the  creation  of  database  management  systems  in  the  1960’s.  For  the  next  twenty 
years,  the  hierarchical  and  network  models  dominated  the  field  of  database  design.  In  1970, 
Godd  (8)  introduced  a  data  model  based  on  a  tabular  format,  called  the  relational  model. 
However,  widespread  use  of  the  relational  model  did  not  occur  until  the  late  1970’s  when 
straightforward  query  languages  for  the  relational  model  were  developed  and  prototype 
systems  validated  the  model’s  efficiency  (30:2). 

Today,  the  major  push  in  the  realm  of  database  research  comes  from  a  demand  for 
non-standard  or  non-business  applications.  These  applications  include  engineering  design, 
such  as  computer-aided  design  (CAD)  and  computer-aided  software  engineering  (CASE),  as 
well  as  office  automation  (33:2).  These  new  applications  require  more  complex  data  models 
to  efficiently  map  hierarchical  data.  “A  database  model  should  allow  databases  to  be 
viewed  in  a  manner  that  is  based  upon  the  meaning  of  data  as  seen  by  its  users...”  (4:xvii). 
The  nested  relational  model  is  an  attempt  to  represent  complex  data  within  a  relational 
model  framework. 
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In  the  nested  relational  model,  as  with  the  relational  model,  information  is  stored  in 
a  tabular  format.  However,  attributes  in  the  relation  do  not  have  to  be  atomic, 'but  can 
have  set- valued  or  relation- valued  attributes.  This  allows  information  to  be  stored  in  the 
database  in  a  way  that  corresponds  to  the  user’s  interpretation  of  the  data.  Traditional 
relational  systems  break  complex  data  into  several  first  normal  form  relations.  Since  non¬ 
standard  applications  involve  queries  that  require  access  to  many  of  these  relations  at  once, 
computationally  expensive  joins  are  needed  to  correlate  all  the  necessary  data.  Nested 
relations  require  fewer  joins  for  complex  data  since  the  data  is  spread  across  fewer  relations, 
making  the  data  model  less  confusing  to  users  and  database  processing  more  efficient. 

2.3  Historical  Development  of  the  Nested  Relational  Model 

2.3.1  The  Relational  Model.  The  nested  relational  model  is  the  descendant  of  the 
relational  model  introduced  by  Codd  (8).  Codd’s  seminal  work  represents  data  as  tables, 
called  relations.  Data  in  the  relations  are  accessed  via  a  key  that  uniquely  indentifies 
each  row,  or  tuple,  of  the  relation.  Figure  2.1  shows  a  simple  relational  database  with 
five  relations  that  hold  information  on  VHSIC  Hardware  Description  Language  (VHDL) 
designs  (l).1  In  the  relational  model,  all  attributes  of  every  relation  must  be  atomic,  or 
non-decomposable.  In  other  words,  attributes  may  contain  only  one  character  string,  one 
integer,  or  one  real  number.  When  a  relation  conforms  to  this  constraint,  it  is  in  first 
normal  form  (INF). 

2.3.2  Introduction  of  Set-Valued  Attributes.  Makinouchi  (20)first  suggested  remov¬ 
ing  the  INF  requirement,  laying  the  groundwork  for  the  nested  relational  model.  A  model 
allowing  attributes  to  have  sets  of  values  (called  set- valued  attributes)  was  introduced  by 
Jaeshke  and  Schek  (17).  Figure  2.2  shows  the  relational  database  for  the  VHDL  example 
used  in  Figure  2.1  where  set-valued  attributes  are  allowed.  Notice  that  the  only  relation 
affected  by  the  change  is  the  SYS.TO.COMP  relation,  where  the  value  for  the  COMP # 
attribute  is  a  set  of  component  numbers.  Using  set-valued  attributes  limits  the  size  of  the 
database  because  SYS#  does  not  have  to  be  repeated  for  every  occurance  of  COMP#. 

1 VHSIG  stands  for  very  high  speed  integrated  circuits. 
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FigUre  2.1.  Representation  of  a  Relational  VHDL  Database 

2.S.3  Introduction  of  Relation-Valued  Attributes.  Thomas  and  Fischer  (32)  went 
another  step  further,  suggesting  that  attributes  of  a  relation  be  allowed  to  hold  not  only 
sets  of  values,  butcomplete  relations.  A  database  that  allows  relation- valued  attributes  is 
called  a  nested  relational  database.  Figure  2.3  shows  the  nested  relational  database  for  the 
VHDL  example.  This  database  is  composed  of  two  nested  relations  with  one  level  of  nesting 
in  each.  The  advantage  of  this  model  comes. from  the  storage  savings  realized  because  the 
SYS#  and  COMP#  attributes  are  not  each  repeated  three  times  in  the  database  schema 
(as  they  are  in  the  two  previous  database  schemas).  In  addition,  only  one  join  is  required 
to  correlate  all  data  in  the  nested  database,  as  compared  to  the  four  joins  required  in  each 
of  the  two  previous  schemas. 
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Figure  2.2.  Representation  of  a  VHDL  Database  with  Set- Valued  Attributes 
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Figure  2.3.  Representation  of  a  VHDL  Database  with  Relation- Valued  Attributes 
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2.4  Notable  Advances  in  Nested  Relational  Database  Theory 

Following  the  development  of  the  nested  rela.  kmal  model  (NRM),  research  in  the 
areas  of  query  languages,  nested  relational  algebra  and  its  optimization,  as  well  as  special¬ 
ized  access  methods  have  demonstrated  the  viabilty  of  the  NRM.  The  next  three  sections 
review  the  work  in  each  of  these  three  areas  relevant  to  the  development  of  the  Triton 
nested  relational  database  system. 

2.4.I  The  SQL/NF  Query  Language.  In  1987,  Roth,  Korth,  and  Batory  (26)  de¬ 
veloped  an  extended  version  of  the  SQL  query  language  for  use  on  nested  relations,  called 
SQL/NF.  As  with  SQL,  SQL/NF  uses  SELECT-FROM-WHERE  (SFW)  expressions  to 
pose  queries  involving  any  level  of  nesting  in  a  nested  relation.  Unlike  SQL,  SQL/NF 
allows  nested  SFW  expressions  in  both  the  SELECT  and  FROM  clauses  to  manipulate 
relation-valued  attributes.  Using  the  nested  relational  schema  of  Figure  2.3,  an  SQL/NF 
query  to  retrieve  the  system  name  and  port  names  for  system  14701  is 


SELECT  name,  (SELECT  name 
FROM  ports) 

FROM  systems 
WHERE  sys#  =  14701 

The  outer  SFW-expression  retrieves  the  name  of  the  system  where  the  system  number  is 
14701.  The  inner  (nested)  SFW-expression  selects  the  name  of  each  port  of  system  14701. 

An  SQL/NF  query  to  retrieve  the  component  names  and  component  port  names  for 
the  components  of  system  14701  is 


SELECT  components. name,  (SELECT  name 

FROM  components. ports) 

1  nun  oj  o  tcmo  ,  C0ui|jwiicm/0 

WHERE  systems . comp#  =  components . comp#  AND  sys#  =  14701 

The  Cartesian  product  of  the  two  relations,  systems  and  components  is  formed,  and  the 
only  tuples  selected  are  those  where  the  comp#  attribute  of  systems  and  components  is  the 
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same  and  sys #  is  1470j..  Then  the  name  attributes  of  components  and  ports  is  projected 
out. 

These  simple  examples  do  not  completely  demonstrate  the  full  power  of  SQL/NF. 
However,  they  do  show  how  complex  queries  can  be  simply  posed  to  a  nested  relational 
database.  Because  of  its  simplicity  and  understandability,  SQL/NF  was  chosen  as  the 
query  language  for  the  Triton  system  (28). 

2.4.2  Colby  Algebra  and  its  Optimization.  While  query  languages,  such  as  SQL/NF, 
are  intended  to  provide  an  easy  query  capability  for  database  users,  rigid  mathematical 
algebras  have  been  adopted  for  use  on  the  relational  and  nested  relational  models  to  for¬ 
malize  the  representation  of  queries.  Natural  language  queries  are  usually  translated  into 
the  more  rigorous  algebra  before  being  processed.  The  first  algebras  developed  for  the 
NRM  (32)  required  nested  relations  to  be  restructured  to  a  relational  format  before  infor¬ 
mation  could  be  extracted  from  the  database.  However,  Colby  (9)  suggested  a  recursive 
algebra  that  retrieves  information  from  any  level  of  nesting  in  a  relation  without  first 
restructuring  the  relation. 

Colby  redefined  the  select  (cr),  project  (tt),  and  join  (M)  operators  of  the  relational 
model,  and  introduced  two  restructuring  operators  nest  (f)  and  unnest  (/1).  She  also 
decribed  optimization  techniques  for  her  recursive  algebra.  Appendix  A  presents  the  op¬ 
eration  of  the  Colby  algebra  by  some  examples.  Because  of  the  ease  with  which  SQL/NF 
is  mapped  into  the  Colby  algebra,  the  Colby  algebra  was  selected  for  the  Triton  system. 

2.J,.3  Indexing  Techniques  and  Access  Methods.  In  databases,  it  is  sometimes  useful 
to  have  an  index  on  an  attribute  or  several  indices  on  a  number  of  attributes  to  facilitate 
quick  access  to  a  particular  value.  In  the  next  two  subsections,  two  papers  are  reviewed 
that  discuss  indexing  techniques  and  their  use  as  access  methods  for  nested  relational 
databases.  In  the  first  article,  Bertino  and  Kim  (3)  present  three  indexing  techniques 
for  use  on  nested  relations.  In  the  second  article,  the  developers  of  the  ANDA  nested 
database  system  (11)  discuss  a  unique  indexing  structure  and  its  use  for  efficient  access 
to  information  in  their  database.  At  the  present  time,  no  indexing  techniques  have  been 
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implemented  for  the  Triton  system.  As  such,  these  two  articles  are  intended  to  serve  as  a 
recommended  approach  for  future  developers  of  the  Triton  system. 

2.Jt.S.l  Bertino  and  Kim’s  Indexing  Techniques.  The  first  indexing  technique 
introduced  by  Bertino  and  Kim  sets  up  a  nested  index ,  that  correlates  the  value  of  an 
atomic  attribute  at  some  level  of  nesting  with  the  indices  of  those  tuples  at  the  outermost 
level  of  the  relation  that  contain  that  value.  The  second  index  type,  called  the  path  index 
correlates  the  value  of  an  atomic  attribute  at  some  level  of  nesting  with  a  set  of  path  indices 
along  every  level  of  nesting  correponding  to  those  tuples  with  that  value.  The  third  and 
final  index  type  introduced  by  Bertino  and  Kim  is  the  multiindex  which,  given  a  specific 
path,  creates  a  separate  index  for  each  subpath. 

Following  the  presentation  of  the  three  indexing  techniques,  Bertino  and  Kim  com¬ 
pared  them  based  on  storage  cost,  retrieval  cost,  and  update  cost.  They  concluded  that 
the  nested  index  has  the  lowest  storage  cost  and  the  best  retrieval  performance,  while  the 
multiindex  is  best  with  respect  to  update  performance. 

2. 4. 3.2  ANDA  Nested  Database  System.  Deshpande  and  Van  Gucht  (11)  im¬ 
plemented  a  nested  relational  database  called  ANDA.  Of  particular  interest  to  this  re¬ 
search  are  the  access  methods  and  mechanisms  ANDA  uses  for  the  retrieval  of  data  from 
the  database.  ANDA  makes  a  distinction  between  value-driven  operations,  such  as  select, 
join,  and  nest,  and  structure-oriented  operations,  such  as  project  and  unnest. 

Speed  of  retrieval  of  information  is  maximized  for  both  types  of  operations  by  using 
two  different  storage  structures;  VALTREE  is  the  value-driven  indexing  structure,  and 
RECLIST  is  a  special  record-list  structure  used  for  structure-oriented  operations.  VAL¬ 
TREE  maps  values  to  a  list  of  tuple  identifiers  in  all  relations  and  relation-  values  attributes- 
that  contain  that  value.  RECLIST  maps  these  tuple  identifiers  to  the  actual  physical  ad- 
ditoses  where  these  tuples  are  stored.  For  increase  efficiency,  ANDA  stores  tuple  identifiers 
obtained  from  VALTREE  in  a  special  cache  consisting  of  a  set  of  stacks.  Once  these  tu¬ 
ple  identifiers  have  been  processed  within  the  cache  as  specified  by  the  query,  RECLIST 
retrieves  the  actual  values  or  tuples  in  response  to  die  query. 
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2.5  Related  Work 


This  section  is  intended  to  provide  a  brief  survey  of  nested  relational  database  imple¬ 
mentations.  This  section  describes  the  Advanced  Information  Management  Prototype,  the 
Darmstadt  Database  System,  and  the  Verso  DBMS,  focusing  on  their  storage  management 
techniques  and  access  methods. 

2.5.1  Advanced  Information  Management  Prototype  (AIM-P).  The  Advanced  In¬ 
formation  Management  Prototype  (AIM-P)  (10)  was  developed  by  IBM  as  a  research  ve¬ 
hicle  in  the  realm  of  non-standard  applications.  There  are  two  similarities  between  AIM-P 
and  the  Triton  system:  (1)  both  are  intended  to  be  the  database  implementation  “back¬ 
end”  for  design  application  tools,  and  (2)  both  use  the  nested  relational  data  model  to 
represent  the  underlying  structure  of  the  database. 

Because  AIM-P  was  developed  completely  from  scratch,  the  developers  had  the  free¬ 
dom  to  implement  their  design  to  take  full  advantage  of  the  underlying  data  model  (as 
opposed  to  the  Triton  system,  which  was  constrained  by  the  specific  capabilities  of  the 
EXODUS  storage  manager).2  Thus,  AIM-P’s  physical  storage  structure  differs  from  Tri¬ 
ton’s  in  that  Triton  maps  relation  definitions  into  a  programming  language,  while  AIM-P 
uses  a  tree  structure  to  hold  the  same  information.  The  advantage  of  a  tree  structure 
representation  of  data  is  that  tuples  can  be  added  or  deleted  quickly  at  any  level  of  nesting 
by  simply  adding  or  deleting  a  data  pointer  to  the  appropriate  internal  node  of  the  tree. 

2.5.2  Darmstadt  Database  System  (DASDBS).  Developed  at  the  Technical  Uni¬ 
versity  of  Darmstadt,  the  Darmstadt  Database  System  (DASDBS)  (16,  27)  is  “...a  kernel 
that  integrates  the  common  features  of  a  rather  low-level  storage  component,  but  allows 
efficient  and  flexible  front  ends  tailored  to  specific  application  classes...”  (27:51). 

The  DASDBS  kernel  provides  access  (such  as  reading,  insertion,  and  deletion)  to  sets 
of  complex  objects  as  opposed  to  a  one-record-at-a-time  interface  by  fetching  or  storing  lists 
of  pages  via  a  variable  size  buffer.  Thus,  a  single  scan  of  a  complex  object  retrieves  all  of 
the  values  of  its  sub-objects,  which  limits  the  number  of  disk  accesses.  This  is  very  similar 

2This  is  explained  in  more  detail  in  Chapter  4. 
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to  the  way  the  EXODUS  storage  manager  works  (5).  The  kernel  provides  operations  to 
read  an  object  (similar  to  EXODUS’s  scan),  insert  an  object  (similar  to  EXODUS’s  in  ... 
new  construct),  and  delete  an  object  (similar  to  EXODUS’s  delete).  Like  the  EXODUS 
storage  manager,  the  DASDBS  kernel  provides  concurrency  control  capabilities. 

2.5.3  The  Verso  DBMS.  The  Verso  DBMS  (29)  is  a  relational  database  system 
that  stores  data  in  nested  form  (called  V-relations)  to  increase  query  processing  speed. 
The  Verso  system  consists  of  three  layers: 

1.  The  V-relation  level,  which  is  made  up  of  V-relations  and  their  schema 

2.  The  file  level,  which  is  made  up  of  the  index  and  the  physical  representations  of 
V-relations 

3.  The  lowest  level,  which  is  made  up  of  blocks 

This  is  similar  to  the  Triton  system,  in  that  the  V-relation  levelcorresponds  to  Triton’s 
system  catalogs,  the  file  level  corresponds  to  the  E  language  representation  of  relations, 
and  the  lowest  level  corresponds  to  the  EXODUS  storage  manager. 

The  Verso  system  utilizes  a  “filter”  (mapper)  for  on-the-fly  processing  of  tuples, 
which  can  perform  all  algebraic  operations  except  for  restructuring  actions.  The  filter  is 
implemented  as  a  finite  state  automaton  (ESA),  which  scans  the  V-relation  one  byte  at  a 
time.  The  advantage  of  using  such  a  filter  is  that  query  processing  is  much  faster  because  it 
is  mapped  to  a  very  low-level  representation  of  the  problem  space  on  a  dedicated  machine. 
The  disadvantage  is  that  this  low-level  representation  is  complex  and  difficult  to  grasp, 
making  the  filter  hard  to  modify  and  maintain. 

2. 6  EXOD  US  Extensible  Database  System 

The  creation  of  a  new  database  system  is  not  a  trivial  task.  As  with  any  major  soft¬ 
ware  system,  significant  time  is  spent  in  the  design,  implementation,  and  testing  phases. 
Because  of  the  need  for  the  development  of  new  database  systems  with  novel  capabilities, 
extensible  database  systems  have  been  designed  to  simplify  the  production  of  application- 
specific  DBMSs.  Two  such  systems  are  the  EXODUS  extensible  database  system,  devel- 
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oped  at  the  University  of  Wisconsin  (7),  and  the  GENESIS  system  (2).  The  EXODUS 
system  is  not  a  DBMS,  but  a  toolkit  that  provides  the  necessary  facilities  and  tools  to  aid 
in  the  development  of  new  DBMSs.  The  goal  of  the  EXODUS  system  is  the  provision  of 
extensibility  without  sacrificing  performance  (7:475). 

Because  of  its  power,  flexibility,  and  availability,  EXODUS  was  selected  to  aid  in 
the  development  of  the  Triton  system.  EXODUS  provides  generic  system  components  and 
furnishes  component  generators  to  aid  in  the  construction  of  DBMS-specific  components. 
When  neither  approach  is  possible,  EXODUS  provides  tools  to  aid  in  the  development  of 
the  component.  The  EXODUS  tools  used  in  the  development  of  Triton  include: 

•  the  storage  manager  (5,  6),  which  stores  the  physical  data  of  the  database  and  pro¬ 
vides  access  to  the  data  via  procedural  calls 

•  the  E  persistent  programming  language  and  its  compiler  (25,  24) 

2.6.1  EXODUS  Tools  Used  in  the  Production  of  the  Triton  Nested  Database  System. 
The  architecture  of  the  Triton  system  is  given  in  Figure  2.4.  At  the  present  time,  Triton’s 
parser  (28)  is  able  to  parse  all  possible  SQL/NF  statements,  but  only  builds  query  trees  for 
the  statements  that  create  and  delete  items  in  the  system  catalogs,  as  well  as  the  statements 
that  directly  translate  into  the  appropriate  algebraic  operations  of  select,  project,  and 
Cartesian  product.  Triton’s  parser  component  was  implemented  using  the  UNIX  tools  of 
YACC  and  LEX. 

The  query  tree  built  by  the  parser  is  passed  to  the  rule-based  optimizer  where  it  is 
changed  to  a  plan  tree.  In  the  optimizer,  the  sequence  and  contents  of  the  nodes  in  the 
query  tree  are  changed  into  an  optimized  plan  tree  and  relational  operators  are  replaced 
with  specific  operator  methods.  At  the  present  time,  Triton’s  optimizer  component  has  not 
been  developed.  However,  the  intention  is  to  use  the  EXODUS  optimizer  generator  (13, 12) 
to  generate  this  component.  The  EXODUS  optimizer  generator  takes  as  input  (1)  a  set  of 
operators,  (2)  a  set  of  methods  that  implement  the  operators,  (3)  transformation  rules  that 
describe  equivalence-preserving  transformations  of  query  trees,  and  (4)  implementation 
rules  that  describe  how  to  replace  an  operator  with  a  specific  method.  Using  these  rules, 
a  specific  optimizer  is  generated  for  the  particular  application. 
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Figure  2.4.  The  Triton  Nested  Database  System  Architecture  Using  EXODUS 

We  chose  to  use  the  EXODUS  optimizer  generator  for  Triton  because  the  relational 
algebra  used  by  Triton  lends  itself  to  EXODUS’  rule-based  method.  This  modular  approach 
to  database  development  will  reduce  the  amount  of  code  required  for  implementation  of 
the  Triton  system.  The  only  unique  code  Triton’s  developers  will  need  to  write  will  be  the 
additional  functions  that  are  called  by  the  optimizer  when  implementing  a  specific  operator 
or  access  method.  With  accurate  cost  functions,  we  anticipate  the  generated  optimizer  will 
work  as  well  as  a  custom  built  one. 

Using  the  programming  constructs  provided  by  the  E  programming  language,  the  E 
code  generator  is  designed  to  traverse  the  plan  tree  and  generate  the  E  code  to  effect  the 
operator  and  access  methods  to  perform  the  query.  The  EXODUS  E  compiler  is  used  to 
link  the  methods  with  the  E  code  that  specifies  the  relation  definitions. 

The  storage  manager  is  a  fixed  component  provided  by  EXODUS.  Conceptually,  the 
storage  manager  is  the  layer  between  the  access  and  operator  methods  and  the  physical 
data  in  the  database.  The  storage  manager  is  accessed  via  procedural  calls  in  the  compiled 
query  which  allow  the  creation,  destruction,  and  iteration  through  the  contents  of  database 
files.  Objects  can  be  inserted  in  and  deleted  from  a  file  at  any  offset  in  the  file,  and  explicit 
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clustering  of  objects  on  disk  can  be  specified.  The  storage  manager  provides  procedures 
for  transaction  management  as  well  as  versioning  of  objects. 

2. 7  Summary 

The  goal  of  this  chapter  was  to  familiarize  the  reader  with  several  key  topics  re¬ 
lated  to  this  research.  This  chapter  introduced  the  nested  relational  model  and  described 
its  historical  development.  Notable  advances  in  nested  relational  theory  relevant  to  the 
Triton  system  were  described,  including  a  brief  survey  of  nested  relational  database  imple¬ 
mentations.  Finally,  the  EXODUS  extensible  database  system  was  described,  focusing  on 
how  the  EXODUS  toolkit  was  used  to  develop  the  Triton  system.  The  following  chapter 
describes  in  more  detail  how  EXODUS  is  used  to  develop  the  access  and  operator  methods 
for  the  Triton  system. 


2-13 


III.  Design  and  Implementation 


3.1  Overview 

The  goal  of  this  research  is  to  design  and  implement  operator  and  access  methods 
for  the  Triton  nested  relational  database  system  to  efficiently  process  queries  on  nested 
data.  First,  operator  methods  are  developed  to  implement  the  project,  select,  and  join 
operators  in  the  Colby  relational  algebra.  Second,  access  methods  are  implemented  to  add 
a  relation,  delete  a  relation,  store  tuples,  delete  tuples,  and  modify  tuples.  The  methods 
are  implemented  by  the  E  code  generator  called  codegen.  Figure  3.1  shows  how  the  E  code 
generator  fits  into  the  backend  of  the  prototype  database. 

The  optimizer  produces  a  plan  tree,  which  structurally  represents  the  database  user’s 
query.  The  nodes  of  the  plan  tree  contain  specific  operator  and  access  methods.  The 
program  codegen  traverses  the  plan  tree  in  postorder  and,  through  the  use  of  templates, 
generates  the  code  required  to  implement  the  query.  This  code  is  written  to  a  file  called 
query. e,  which,  when  executed,  performs  the  query.  A  unique  query. e  file  is  dynamically 
created  for  each  user  query.  If  a  relation  is  added  to  or  deleted  from  the  database,  codegen 
does  not  create  the  file  query. e,  but  directly  performs  the  necessary  actions  to  add  or  delete 
the  relation. 

The  remainder  of  this  chapter  is  divided  into  three  sections.  The  first  section  de¬ 
scribes  how  nested  relations  are  represented  in  Triton.  The  second  section  describes  Tri¬ 
ton’s  operator  and  access  methods.  Finally,  the  third  section  discusses  the  design  of  the  E 
code  generator  that  implements  the  methods. 


result 
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Figure  3.1.  Prototype  Nested  Database  Backend 


3.2  Representation  of  Nested  Relations  in  Triton 

Because  of  its  central  importance  to  the  Triton  system,  it  is  important  to  know  how 
nested  relations  are  represented  in  E.  A  brief  discussion  of  the  declaration  and  representa¬ 
tion  of  both  non-nested  and  nested  relations  in  the  E  programming  language  follows. 

3.2.1  E  Representation  of  Non-Nested  Relations.  Figure  3.3  shows  the  relation 
specification  for  the  non-nested  children  relation  of  Figure  3.2.  A  tuple  of  the  children 
relation  is  of  type  child.  The  attributes  are  specified  before  the  keyword  public.  Following 
the  keyword  public  are  the  constructor  and  member  functions.  The  constructor  function 
takes  as  input  three  character  pointers  and  an  integer  to  initialize  a  tuple  in  the  children 
relation.  The  get.emp.ssn ,  get.child.name ,  get.child.age,  and  get.sex  member  functions 
return  the  value  of  the  corresponding  attribute.  The  change.emp.ssn,  change.child.name, 
change.child.age,  and  change.child.sex  member  functions  enable  the  values  of  the  at¬ 
tributes  to  be  changed.  The  print  member  function  takes  a  pointer  to  a  child  tuple  and 
prints  out  the  values  of  the  attributes.  The  implementation  of  the  member  functions  are 
not  given  here.  The  line 

dbclass  child_relation:  collection  [child]; 
specifies  that  a  child-relation  type  consists  of  a  collection  of  tuples  of  type  child.  The  line 
persistent  child_relation  children; 
declares  a  persistent  relation  called  children,  which  is  of  type  child-relation. 

3.2.2  E  Representation  of  Nested  Relations.  The  E  programming  language  sup¬ 
ports  the  implementation  of  nested  relations  through  the  use  of  collections  for  relation¬ 
valued  attributes.  Figure  3.5  shows  the  E  language  relation  specification  for  the  em¬ 
ployee  relation  of  Figure  3.4.  The  emp  type  contains  four  atomic  attributes  called  dept, 
emp.name,  emp.age,  and  emp.ssn  as  well  as  one  relation- valued  attribute  called  children, 
which  is  a  collection  of  tuples  of  type  child.  As  a  requirement,  only  atomic  attributes  are 
specified  in  the  constructor  and  member  functions.  The  employee  relation  is  a  persistent 
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Figure  3.2.  The  Children  Relation 


object  of  type  empjrelation  made  up  of  a  collection  of  tuples  of  type  emp.  Any  level  of 
nesting  can  be  represented  in  the  E  programming  language  in  a  similar  fashion. 


dbstruct  child  { 

dbchar  emp_ssn[12j ; 
dbchar  child.name [32] ; 
dbint  child.age; 
dbchar  sex [2] ; 
public : 

child  (char  *,  char  *,  int,  char  *) ; 

char  *  get_cmp_ssn() ; 

void  change_emp_ssn  (char  ») ; 

char  *  get_child_name() ; 

void  change.child.name  (char  *) ; 

int  got_child_age() ; 

void  change_child_age  (int) ; 

char  *  get.sexO; 

void  change.sex  (char  *) ; 

void  print  (child  *) ; 

>; 

dbclass  child.relation:  collection  [child]; 
persistent  child.r elation  children; 


Figure  3.3.  E  Specification  of  the  Children  Relation 
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Figure  3.4.  Tlie  Nested  Employee  Relation 


dbstruct  child  { 

dbchar  child. name [32] ; 
dbint  child. age; 
dbchar  sex [2] ; 
public: 

child  (char  *,  int,  char  *) ; 

char  ♦  got.child.nameO; 

void  change.child.namo  (char  *) ; 

int  get.child.ageO; 

void  change.child.age  (int); 

char  *  got.sexO ; 

void  change.sex  (char  *) ; 

void  print  (child  *) ; 

>; 

dbstruct  emp  { 

dbchar  dept [20] ; 
dbchar  emp.name [32] ; 
dbint  emp.ago; 
dbchar  emp_ssn[12] ; 
dbclass  childRVA :  collection  [child] ; 
childRVA  children; 
public : 

emp  (char  *,  char  »,  int,  char  *) ; 

char  *  get.deptO; 

void  change.dept  (char  *) ; 

char  *  get.emp.nameO ; 

void  chango.emp.naae  (char  *); 

int  get.emp.ageO ; 

void  change.emp.age  (int) ; 

char  *  get.emp.ssnO ; 

void  chango.emp.ssn  (char  *) ; 

void  print  (emp  *) ; 

/; 


dbclass  emp..r elation:  collection  [emp]; 
persistent  emp.rolation  employee; 


Figure  3.5.  E  Specification  of  the  Employee  Relation 


3.3  Design  and  Function  of  Methods 

In  this  research  effort,  I  designed  and  implemented  two  operator  methods.  The  files- 
can  method  can  perform  a  select,  a  project,  or  both  a  select  and  project.  The  loops.join 
method  implements  the  join  operator  and  can  perform  a  select  and/or  project  in  conjunc¬ 
tion  with  the  join.  In  addition,  five  methods  were  implemented  that  (1)  create  a  relation, 
(2)  delete  a  relation,  (3)  add  tuples  to  a  relation,  (4)  modify  tuples  in  a  relation,  and 
(5)  dehte  tuples  from  a  relation.  The  methods  that  create  and  delete  a  relation  are  not 
generated  by  codegen  and  written  to  the  query. e  file.  Instead,  codegen  directly  performs 
the  operations  necessary  to  create  or  delete  the  relation.  Because  of  this,  the  create  re¬ 
lation  and  delete  relation  methods  are  not  described  in  this  section,  but  are  explained  in 
Section  3.4. 

The  design  and  implementation  of  my  operator  and  access  methods  utilize  several 
data  structures.  Because  these  structures  play  a  vital  role  in  understanding  the  design  of 
Triton’s  operator  and  access  methods,  Appendix  B  explains  each  one  in  detail. 

3.3.1  The  Filescan  Method.  The  filescan  method  is  implemented  as  an  iterator 
function  (25)  that  yields  tuples  to  the  iterate  statement  with  the  schema  defined  by  the 
projection  list.  If  a  projection  is  taking  place,  the  filescan  method  first  defines  a  tempo¬ 
rary  nested  relational  schema  to  hold  a  tuple  of  the  projected  input  relation.  Once  this 
temporary  relation  template  has  been  defined,  filescan  iterates  through  each  tuple  of  the 
input  relation  and  determines  if  the  tuple  meets  the  selection  condition,  if  such  a  condition 
exists.  If  the  tuple  meets  the  selection  condition,  the  appropriate  attributes  are  copied 
into  the  temporary  relation  template.  A  pointer  to  this  temporary  relation  tuple  is  yielded 
(returned)  to  the  calling  procedure. 

3.3. 1.1  Plan  Tree  Structural  Represe  lation  of  a  Filescan.  Using  the  nested 
relational  schema  for  the  employees  nested  relation  contained  in  Appendix  C,  the  following 
example  illustrates  how  the  filescan  method  works.  Assume  we  have  a  requirement  to  do 
a  project  and  a  select  on  the  employees  relation.  The  attributes  to  be  projected  out  from 
employees  are  name  and  age,  and  from  the  children  relation- valued  attribute  of  employees, 
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the  name  and  age  attributes  must  be  projected.  Prom  the  toys  relation-valued  attribute 
of  children ,  the  name  attribute  must  be  projected.  The  selection  condition  extracts  only 
those  tuples  where  the  employees’  age  is  over  30  and  the  children’s  age  is  less  than  5. 

The  Colby  algebra  equivalent  of  this  query  is: 

7r  ((name,  age,  children  (name,  age,  toys  (name)))  o  (employeesase>3o  (childrenaae<s))) 

The  plan  tree  representation  of  this  query  is  shown  in  Figure  3.6.  Notice  that  since 
the  predicate,  age  <  30,  pertains  to  the  atomic  attributes  at  the  highest  level  of  the 
employees  relation,  the  predicate  tree  hangs  off  the  pred  portion  of  the  plan  node.  Since 
the  second  predicate,  age  <  5,  pertains  to  the  children  relation- valued  attribute,  the 
one  node  predicate  tree  hangs  off  the  cond  portion  of  the  children  list  node.  At  first,  I 
attempted  to  make  one  predicate  tree  that  specified  all  conditions  for  the  query  regardless 
of  what  level  of  nesting  they  pertained  to,  but  found  this  required  many  traversals  of  the 
entire  predicate  tree  to  extract  conditions  (one  traversal  for  each  level  of  nesting).  I  found 
it  was  more  efficient  (in  terms  of  time)  to  separate  the  predicate  into  smaller  predicate 
trees,  one  for  each  level  of  nesting.  This  way,  only  the  conditions  relevant  to  that  level  are 
present,  and  the  predicate  tree  is  not  unnecessarily  traversed  if  no  conditions  exist  for  that 
level. 

Each  projected  attribute  is  represented  by  a  list  node.  Relation-valued  attributes 
also  appear  as  list  nodes,  but  only  list  nodes  representing  relation-valued  attributes  can 
have  values  for  cond  and  sublist.  The  cond  field  of  the  list  node  for  the  relation-valued 
attribute  children  points  to  the  one  node  predicate  tree  representing  the  condition  age  <  5. 
Since  there  are  attributes  being  projected  from  the  relation-valued  attribute  children,  the 
sublist  field  of  the  lis4  node  representing  the  children  attribute  points  to  another  linked  list 
of  list  nodes  representing  those  projected  attributes  of  children.  I  thought  of  representing 
the  projected  attributes  at  all  levels  of  nesting  as  one  linked  list,  rather  than  using  sublists 
for  the  projected  attributes  of  relation- valued  attributes.  However,  the  use  of  sublists  give 
the  representation  of  nested  relations  a  recursive  appearance,  which  lends  itself  well  to 
the  use  of  recursion  in  the  construction  of  the  operator  and  access  methods  by  the  code 
generator  codegen. 
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Figure  3.6.  Example  Plan  Tree  for  Filescan  Query 

3.3. 1.2  How  the  Filescan  Method  Works.  The  code  that  implements  the  files- 
can  for  the  tree  discussed  above  is  generated  by  codegen  and  contained  in  the  file  query. e. 
The  contents  of  query,  e  for  this  plan  tree  are  given  in  Appendix  E.  The  code  between 
the  extern  statement  and  the  iterator  statement  is  the  temporary  relation  schema  that 
specifies  the  structure  of  a  relation  with  the  necessary  attributes  projected  out  from  the 
input  relation.  This  specification  also  includes  the  code  to  implement  the  constructor  and 
member  functions  of  the  temporary  relation. 

The  iterator  actually  performs  the  filescan.  The  filescan  iterator  is  set  up  as  a  series 
of  steps  for  each  level  of  nesting.  These  steps  are: 


1.  iterate  through  the  input  relation,  or  relation  ™ 


alued  attribute  of  the  input  relation, 


scanning  one  tuple  at  a  time 


2.  test  the  tuple  to  see  if  it  meets  the  selection  criterion,  if  one  exists  for  that  level  of 
nesting 
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3.  project  the  necessary  attributes  from  the  input  tuple  (by  using  the  get .  member  func¬ 
tions)  and  record  them  in  the  temporary  relation  schema  (by  using  the  constructor 
function) 

4.  set  up  necessary  reference  pointers  to  the  input  relation  and  to  the  temporary  relation 
if  there  is  another  level  of  nesting  in  the  input  relation 

At  the  outermost  level  of  nesting,  after  all  other  relation- valued  attributes  have  been 
processed,  a  pointer  to  the  temporary  tuple  is  returned  by  the  filescan  iterator.  The  main 
program  of  query. e  iterates  through  the  filescan  tuple-by-tuple  and  prints  out  the  values 
of  the  attributes. 

3.S.1.S  Rationale  Behind  Design  Decisions  for  the  Filescan  Method.  I  did 
not  initially  implement  the  filescan  method  as  an  iterator  function,  but  performed  the 
projection  and  selection  in  the  main  program  along  with  the  code  that  prints  the  result.  I 
decided  to  design  the  filescan  as  a  separate  function  to  make  th .  code  in  the  query,  e  file 
modular  and  less  confusing,  which  is  particularly  useful  when  the  query,  e  file  implements 
several  fdescans  in  one  query  (which  occurs  when  filescans  filter  tuples  to  a  join,  as  will  be 
discussed  in  Section  3.3.2). 

After  I  decided  to  make  the  filescan  a  separate  function,  I  designed  it  to  return 
the  resultant  relation  as  a  whole  after  the  processing  was  complete.  However,  it  is  much 
faster  to  filter  the  relation  a  tuple  at  a  time  if  further  processing  of  the  tuple  is  going 
to  take  place  (for  example,  if  the  filescan  is  filtering  tuples  to  a  join).  This  way,  the 
filescanned  relation  is  only  processed  once.  I  chose  to  implement  the  filescan  as  an  iterator 
function,  because  iterator  functions  are  controlled  looping  constructs  that  automatically 
step  through  a  sequence  of  items.  Since  each  tuple  of  the  filescanned  relation  needs  to 
be  processed  further  (for  a  join)  or  printed  (for  a  filescan  alone),  I  felt  that  designing  the 
filescan  method  as  an  iterator  function  would  elegantly  serve  my  needs. 

3.3.2  The  Loops. Join  Method.  The  loops.join  method  is  implemented  as  an  iterator 
function  that  yields  tuples  with  the  schema  defined  by  the  join.  The  loops.join  method 
can  implement  a  projection  and/or  a  selection  in  conjunction  with  the  join.  The  loops.join 
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method  first  defines  a  temporary  nested  relational  schema  to  hold  a  tuple  of  the  result  of 
the  join,  or  a  tuple  of  the  projected  result  of  the  join  (if  a  projection  is  also  taking  place). 
The  loops.join  method  iterates  through  each  tuple  of  the  left  input,  and  for  each  tuple, 
it  iterates  through  all  tuples  of  the  right  input.  If  the  tuples  meet  the  join  criterion,  the 
appropriate  attributes  of  each  are  copied  into  the  temporary  relation.  If  no  join  criterion 
is  specified,  the  loops.join  method  acts  as  a  cross  product.  A  pointer  to  the  temporary 
relation  tuple  is  returned  to  the  calling  procedure. 

3.3.2. 1  Plan  Tree  Structural  Representation  of  a  LoopsJoin.  To  illustrate 
how  the  loops.join  method  works,  the  nested  relation  employees  in  Appendix  C  and  the 
nested  relation  products  in  Appendix  D  are  used.  The  required  operation  is  to  join  the 
products  relation  with  the  toys  relation-valued  attribute  of  children,  where  children  is  a 
relation-valued  attribute  of  employees.  The  join  criterion  is  that  the  name  of  the  product 
must  be  the  same  as  the  name  of  the  toy.  Tc  make  this  example  more  realistic,  assume  a 
filescan  is  going  to  take  place  on  employees  that  projects  out  the  employee’s  name  as  well 
as  the  name  of  the  toy.  In  addition,  assume  a  filescan  is  going  to  take  place  on  products 
that  projects  out  the  product’s  name  and  all  the  information  on  the  manufacturer  of  the 
product. 

The  Colby  algebra  equivalent  of  this  query  is: 

M o  (tf  ((name,  children  (toys  (name)))  employees)  (children  (toys)),  n  ((name, 

manufacturer)  products)) 

where  6  is  toys.name  =  products.name 

t 

The  plan  tree  representation  of  this  query  is  shown  in  Figure  3.7.  The  two  inputs  to 
the  loops.join  node  are  filescan  nodes  that  each  have  attribute  lists  for  their  projections. 
The  second  attribute  of  the  products  filescan  is  a  relation-valued  attribute  but  does  not 
have  a  sublist,  hence  all  attributes  of  manufacturers  are  to  be  projected  out.  The  loops.join 
node  specifies  the  join  criterion,  which  hangs  off  the  pred  field  of  the  plan  node.  Notice 
that  the  constant.on.right  field  of  the  pred  node  is  false,  meaning  that  both  operands  in 
the  predicate  are  attribute  names  (and  not  constants). 
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Figure  3.7.  Example  Plan  Tree  for  Loops-Join  Query 
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On  the  loops.join  node,  the  Colby  join  path  is  specified  by  the  two  list  nodes,  children 
and  toys.  This  means  that  the  join  is  to  occur  between  the  toys  relation- valued  attribute 
and  the  products  relation.  While  the  Colby  algebra  enables  a  join  to  occur  within  a  nested 
relation- valued  attribute  of  the  first  relation,  the  join  can  only  occur  at  the  top  level  of  the 
second  relation.  Otherwise,  the  joined  schema  would  not  be  constructable. 

S.3.2.2  How  the  Loops.Join  Method  Works.  As  with  the  filescan  method, 
the  code  that  implements  the  loops.join  method  for  the  plan  tree  shown  in  Figure  3.7  is 
generated  by  codegen  and  contained  in  the  file  query. e.  Appendix  F  shows  the  contents 
of  query. e  for  this  plan  tree.  The  group  of  code  before  the  main  program  can  be  logically 
separated  into  three  sections: 

1.  the  templates  and  filescan  iterator  function  for  the  employees  relation 

2.  the  templates  and  filescan  iterator  function  for  the  products  relation 

3.  the  templates  and  loops.join  iterator  function  for  the  join  of  the  products  relation  to 
the  employees  relation 

The  first  two  sections  (which  implement  the  two  filescans)  operate  as  discussed  in  Sec¬ 
tion  3.3.1.  The  third  section  of  code  (which  implements  the  join)  is  similar  to  the  two 
filescan  sections,  but  has  some  notable  differences. 

First,  a  temporary  relation  structure  is  defined  that  holds  a  tuple  of  the  result  of  the 
join.  This  is  performed  the  same  way  as  in  the  filescan  method.  The  temp5  structure  is  the 
level  where  the  join  is  occuring.  Notice  that  the  name  attribute  of  toys  has  been  renamed 
to  toys.name  and  the  name  attribute  of  products  has  been  renamed  to  products.name.  The 
reason  for  the  renaming  is  that  attribute  names  must  be  unique  within  a  level  of  nesting. 
Since  the  joining  of  the  relations  would  put  two  occurances  of  name  at  one  level  of  nesting, 
these  two  attributes  are  given  a  unique  name  b>  codegen  by  combining  the  parent  relation 
name  with  the  attribute  name. 

The  loops.join  iterator  performs  the  join.  For  each  tuple  of  the  filescan  on  the  left, 
if  the  join  level  has  not  been  reached,  the  join  template  is  filled.  This  continues  until  the 
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join  level  is  reached.  Then  for  each  tuple  of  the  filescan  on  the  right,  if  the  join  condition 
is  satisfied,  the  join  template  is  filled  with  information  from  the  two  relations.  If  there  are 
any  relation- valued  attributes  present  in  the  join  template  at  this  level,  reference  pointers 
are  set  up  to  descend  a  level  of  nesting.  All  relation- valued  attributes  that  belong  to  the 
left  and  right  relations  are  filled.  Finally,  after  a  single  tuple  of  the  join  template  has  been 
filled,  a  pointer  to  this  tuple  is  yielded  (returned)  by  the  loops.join  iterator. 

The  main  program  of  query,  e  uses  the  loops.join  iterator  to  step  through  each  tuple 
of  the  joined  relation  and  prints  out  the  value  of  each  attribute. 

3.S.2.3  Rationale  Behind  Design  Decisions  for  the  Loops  Join  Method.  Be¬ 
cause  the  loops.join  method  may  feed  a  join  above  it,  I  also  implemented  this  method  as 
an  iterator  function.  The  alternative  is  to  return  the  joined  relation  after  processing  as  a 
whole;  however,  if  further  processing  is  taking  place  above  the  join,  it  is  more  efficient  in 
terms  of  time  to  perform  the  join  as  a  filter,  passing  tuples  one  at  a  time  to  the  node  above. 
I  designed  the  loops.join  method  to  carry  out  a  projection  and  selection  in  conjunction 
with  the  join  to  avoid  having  to  process  all  the  tuples  of  the  join  a  second  time. 

3.3.3  Store.Values  Method.  The  store.values  method  can  insert  any  number  of  tu¬ 
ples  into  a  nested  relation  at  any  level  of  nesting.  Condition  statements  are  used  to  locate 
the  correct  tuple  in  the  relation  where  the  new  values  are  to  be  stored.  The  code  generator 
creates  the  query,  e  file  which  contains  the  code  to  implement  the  store.values  method. 

As  an  example,  assume  the  employees  schema  as  defined  in  Appendix  C  is  used. 
Suppose  we  wish  to  add  a  new  project  to  the  employee  David  and  two  new  toys  to  his 
daughter,  Flo.  Figure  3.8  gives  the  plan  tree  representation  of  this  query.  The  leaves  of  the 
tree  are  values  that  are  to  be  inserted  into  the  relation.  At  the  leaf  level,  each  attribute 
of  the  relation  or  relation-valued  attribute  has,  by  definition,  a  corresponding  list  node 
containing  the  value  of  the  attribute  for  this  new  tuple.  This  value  is  contained  in  the 
value  field  of  the  attrdcsc  node.  If  a  relation- valued  attribute  is  being  into  the  relation,  it 
may  have  a  sublist  pointing  to  the  values  of  its  attributes.  A  null  sublist  indicates  that 
there  are  no  tuples  for  this  relation- valued  attribute. 
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Figure  3.8.  Example  Plan  Tree  For  Store-Values  Query 

The  contents  of  query. e  for  the  plan  tree  of  Figure  3.8  is  given  in  Appendix  G.  The 
method  scans  each  tuple  of  employees  until  David  is  reached.  Then  David’s  children  are 
scanned  until  Flo  is  reached,  where  two  new  toys  are  added.  Finally,  a  new  project  is 
added  to  David’s  projects  relation-valued  attribute. 


3.3.4  Modify  Method.  The  modify  method  can  modify  values  of  the  attributes  for 
a  nested  relation  at  any  level  of  nesting.  Condition  statements  are  used  to  locate  the  tuple 
in  the  relation  where  the  values  are  to  be  changed.  If  no  condition  is  specified,  the  atomic 
values  of  all  tuples  in  the  relation  or  relation-valued  attribute  will  be  changed  to  the  new 
value.  The  code  generator  codegen  creates  the  query. e  file  which  contains  the  code  to 
implement  the  modify  method. 

To  illustrate,  using  the  employees  schema  as  defined  in  Appendix  C,  suppose  that  the 
employee  David  is  working  on  a  project  named  “BNAD”  that  must  be  renamed  “TROY”. 
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Figure  3.9.  Example  Plan  Tree  For  Modify  Query 


Also  assume  we  wish  to  change  the  age  of  his  daughter,  Flo,  to  four,  and  all  her  toys  to  blue. 
Figure  3.9  gives  the  plan  tree  representation  of  this  query.  The  leaves  of  the  tree  are  new 
values  for  attributes  in  the  relation.  The  three  attributes  being  changed  (project  name, 
child  age,  and  toy  color)  are  represented  by  three  leaf  nodes.  The  conditions  represented 
by  the  predicate  trees  specify  whether  the  change  is  occurring  at  selected  tuples  or  to  all 
tuples  of  the  relation  or  relation- valued  attribute. 

The  contents  of  query. e  for  the  plan  tree  depicted  in  Figure  3.9  is  given  in  Ap¬ 
pendix  H.  The  method  scans  employees  until  David  is  reached.  Then  David’s  children  are 
scanned  until  Flo  is  reached.  At  this  point,  Flo’s  age  is  changed  to  four,  and  all  her  toys’ 
colors  are  changed  to  blue.  If  we  wanted  to  change  the  color  of  only  some  of  Flo’s  toys,  a 
predicate  tree  would  appear  on  the  toys  list  node  specifying  the  select  condition.  Finally, 
David’s  projects  arc  scanned  until  “BNAD”  is  found,  which  is  renamed  to  “TROY”. 
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Figure  3.10.  Example  Plan  Tree  For  Delete  Query 

3.S.5  Delete  Method.  The  delete  method  can  delete  tuples  from  a  nested  relation 
at  any  level  of  nesting.  Condition  statements  are  used  to  locate  the  level(s)  where  the 
deletion  is  to  occur.  The  code  generator  codegen  creates  the  query. e  file  which  contains 
the  code  to  implement  the  delete  method. 

To  illustrate,  using  the  employees  relation  schema  as  defined  in  Appendix  C,  suppose 
all  of  David’s  projects  and  all  of  his  daughter  Flo’s  toys  are  to  be  deleted.  Figure  3.10  gives 
the  plan  tree  representation  of  this  query.  Only  relation- valued  attributes  appear  in  the 
attribute  list.  A  list  node  with  a  condition  and  a  sublist  indicates  that  the  deletion  does  not 
occur  at  this  level,  but  at  some  sub-level.  If  a  list  node  has  a  condition  but  no  sublist,  only 
those  tuples  that  meet  the  selection  condition  are  deleted.  If  the  list  node  has  no  condition, 
then  all  its  tuples  are  deleted.  When  a  tuple  is  deleted  that  has  a  relation-vaiued  attribute 
in  its  schema,  all  tuples  in  its  relation-valued  attribute  must  be  deleted. 

Appendix  I  gives  the  contents  of  query. e  for  the  plan  tree  of  Figure  3.10.  The  method 
scans  employees  until  David  is  reached.  Then  all  of  David’s  children  are  scanned  until  Flo 
is  reached,  and  all  her  toys  are  deleted.  David’s  projects  are  then  deleted. 


3.4  Code  Generator 

The  code  generator  codegen  generates  a  unique  query. e  file  for  each  query  by  walking 
the  plan  tree.1  The  main  program  of  codegen  makes  a  function  call  to  traverse.planJree 
to  walk  the  plan  tree  and  generate  the  code.  After  this  is  e,  the  main  program  calls 
generate.main  which  generates  the  main  program  of  query. e. 

3.4.I  Rationale  Behind  Design  Decisions  for  the  Code  Generator.  The  end  purpose 
for  the  Triton  system  is  to  be  the  backend  for  non-standard  application  tools.  Application 
programs  make  calls  to  the  database  via  embedded  SQL/NF  statements.  When  it  came 
time  to  design  how  the  Triton  system  would  perform  queries,  two  approaches  were  possible. 
The  query  implementor  could  either  be  (1)  an  interpreter  that  manipulates  the  database 
directly,  or  (2)  a  code  generator  that  generates  the  code  to  implement  the  query. 

There  are  advantages  and  disadvantages  to  both  approaches.  The  interpreter  ap¬ 
proach  would  work  well  in  an  ad  hoc  query  environment,  since  manipulation  of  the  database 
is  performed  directly  by  the  interpreter.  The  code  generator  would  require  substantial  time 
to  compile  and  run  each  generated  query.  The  code  generator  approach  works  best  when 
specific  queries  are  known  ahead  of  time,  so  that  the  code  for  those  queries  is  already 
generated  and  compiled.  The  interpreter  approach  would  take  longer  in  this  type  of  envi¬ 
ronment,  since  the  query  must  be  analyzed  and  executed  on  the  fly.  Because  the  intended 
environment  for  the  Triton  system  does  not  require  an  ad  hoc  query  capability,  the  code 
generator  approach  was  chosen. 

3-4-2  System  Catalogs.  The  code  generator  uses  two  system  catalogs  (which  are 
stored  as  relations)  to  record  information  about  the  relations  in  the  database.  These  two 
catalogs,  developed  by  Mankus  (21)  are  called  Rel-table  and  Sym.table.  Figure  3.11  gives 
an  example  of  the  contents  of  ReLtable  to  hold  the  two  relations,  employees  and  products, 
which  were  used  in  the  description  of  the  loops.join  method  in  Section  3.3.2.  Figure  3.12 
shows  the  corresponding  contents  of  SymJable. 

1  However,  codegen  does  not  create  a  query. e  file  for  the  create  relation  and  delete  relation  access  methods. 
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Figure  3.11.  Contents  of  ReLtable 


ReLtable  lias  three  attributes,  rellndex ,  relName ,  and  relType.  The  name  of  the 
relation  is  recorded  in  the  relName  attribute.  The  rellndex  attribute  serves  as  an  index 
for  ReLtable ,  and  the  relType  attribute  records  the  relation’s  schema  and  is  a  number 
corresponding  to  the  index  of  Sym.table  where  the  schema  is  defined. 

Sym.table  has  seven  attributes  called  index ,  name,  /ewe/,  attr ,  numb ,  parent ,  and 
nest-table.  The  name  attribute  holds  the  name  of  the  schema  or  attribute.  Schema  names 
must  be  unique.  The  level  attribute  tells  whether  this  entry  is  a  schema  or  an  attribute. 
The  attr  attribute  describes  the  type  of  the  attribute.  Relation-valued  attributes  have  the 
value  PREV-DEFINED  for  attr,  while  schemas  have  ON-THE-FLY.  The  numb  attribute 
records  the  number  of  characters  if  the  type  is  CHAR,  the  number  of  bytes  if  the  type  is  INT 
or  FLOAT,  or  the  number  of  attributes  if  the  type  is  PREV-DEFINED  or  ON-THE-PLY. 
If  the  value  for  level  is  ATTRIBUTE,  the  parent  attribute  holds  the  index  number  of  the 
parent  schema.  If  the  value  for  attr  is  PREV-DEFINED,  the  nest-table  attribute  holds 
the  index  number  of  its  defining  schema. 

3.4.3  Operation  of  the  Code  Generator.  All  the  methods  except  for  the  loops.join 
method  is  represented  by  a  single  node  plan  tree.  However,  the  loops.join  method  is 
represented  by  a  multi-node  plan  tree  which  is  traversed  by  codegen  in  postorder.  Codegen 
uses  “statement  templates”  in  the  E  programming  language  to  create  the  query. e  files 
(described  in  the  previous  section)  corresponding  to  the  plan  tree.  Recursion  is  used 
whenever  a  sublist  is  encountered  in  the  plan  tree,  which  greatly  simplifies  the  actions  of 
codegen  and  allows  any  level  of  nesting  in  the  plan  tree.  Since  codegen  directly  performs 
the  actions  necessary  to  create  and  delete  relations,  these  are  described  in  more  detail 
below. 
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8.4.,?. 1  Operation  of  Code  Generator  for  Create.Rel.  When  codegen  encoun¬ 
ters  a  plan  node  where  the  method  is  “CREATEJIEL”,  it  creates  the  files  describing  the 
relation  definitions.  As  an  example,  Appendix  C  shows  the  files  that  describe  the  rela¬ 
tion  definitions  for  the  employees  relation.  In  addition,  codegen  creates  a  batch  file  called 
compileschemas  which,  when  executed,  will  compile  the  relation  definitions. 

3. 4-3.2  Operation  of  Code  Generator  for  Drop.Rel.  When  codegen  encounters 
a  plan  node  where  the  method  is  “DROP-REL”,  it  checks  both  Rel-table  and  SymJable 
to  see  if  the  relation  scheme  is  a.  defining  scheme  for  another  relation  or  if  it  is  used  within 
another  scheme  definition.  If  it  is  not  being  used,  codegen  deletes  the  contents  of  the 
relation,  removes  the  corresponding  scheme  definition  files,  and  deletes  the  scheme  from 


3-18 


the  symbol  table.  After  all  appropriate  schemas  have  been  deleted,  codegen  deletes  the 
relation  from  the  relation  table. 

S.5  Summary 

This  chapter  described  how  nested  relations  are  represented  in  Triton  using  the  con¬ 
structs  of  the  E  programming  language.  Following  this  description,  each  of  the  methods 
implemented  in  this  research  effort  were  discussed.  Finally,  the  chapter  concluded  with  a 
look  at  the  operation  of  Triton’s  code  generator.  Using  the  access  and  operator  methods 
presented  in  this  chapter,  the  following  chapter  evaluates  the  performance  of  the  nested  re¬ 
lational  model  versus  the  relational  model  and  discusses  the  advantages  and  disadvantages 
of  using  the  EXODUS  toolkit  in  the  development  of  the  Triton  system. 


IV.  Analysis  and  Evaluation 


4.1  Overview 

As  stated  in  Chapter  1,  the  Triton  system  is  intended  to  be  the  backend  storage 
component  for  non-standard  database  applications,  such  as  CAD,  CASE,  or  CIS  tools. 
The  first  application  for  the  Triton  system  is  the  representation  of  a  particular  CASE 
methodology,  the  USAF  IDEFo  Structured  Analysis  language  (22).  Morris  (23)  defined  a 
nested  relational  model  and  c  relational  model  representation  for  IDEFo.  As  a  result  of 
his  analysis,  he  speculated  that  the  speed  of  query  execution  for  the  NRM  representation 
would  be  faster  than  that  of  the  relational  representation.  The  goal  of  this  chapter  is  to 
assess  Morris’  speculation  by  implementing  both  NRM  and  relational  representations  in 
the  Triton  system,  and  comparing  query  performance  in  terms  of  query  generation  and 
execution  times.  We  found  that  in  both  areas,  the  nested  representation  outperformed  the 
relational  version  in  terms  of  speed,  particularly  in  code  generation  time.  Following  this 
comparison,  we  discuss  the  advantages  and  disadvantages  of  using  the  EXODUS  toolkit 
for  the  development  of  the  Triton  system. 

Jt.2  Comparison  of  the  NRM  and  Relational  Representations  of  the  IDEFo  Language  Data 

4.2.1  Schema  Definitions.  Using  SQL/NF  CREATE  TABLE  commands,  the  schema 
definitions  for  the  relational  and  nested  relational  representations  of  the  IDEFo  Language 
Data  are  given  in  Appendices  J  and  K,  respectively.  At  the  outermost  level,  only  one 
SQL/NI  CREATE  TABLE  command  is  needed  for  the  NRM  representation  since  all 
IDEFo  language  data  is  contained  in  one  table.  The  data  used  to  fill  these  tables  is  the 
same  data  used  by  Morris  (23). 

Because  the  interface  between  the  F  .^L/NF  front  end  and  the  backend  of  the  Triton 
system  is  not  yet  developed,  the  SQL/NF  CREATE  TABLE  commands  were  not  used  to 
generate  the  E  language  structures  (the  .e  and  .h  files)  that  describe  the  NRM  and  relational 
representations  of  the  IDEFo  language  data.  Instead,  the  system  catalogs  were  filled  with 
data  describing  the  schema  of  the  two  representations,  and  the  E  language  structures  of 
both  representations  were  automatically  generated  using  the  create.rel  method  described 
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in  Chapter  3.  Even  without  the  use  of  the  SQL/NF  front  end,  generation  of  the  E  language 
structures  was  fairly  straightforward. 

4.2.2  Queries  Used  in  this  Comparison.  Morris  identified  seven  actions  the  CASE 
application  tool  might  request,  from  the  database.  These  are: 

1.  Create  all  database  tables 

2.  Load  all  database  tables 

3.  Erase  all  database  tables 

4.  Extract  all  data  in  database 

5.  Extract  all  drawing  data  for  a  particular  sheet  (diagram) 

6.  Extract  data  dictionary  information  for  a  particular  activity 

7.  Extract  data  dictionary  information  for  a  particular  data  element 

The  last  three  activities  (queries)  in  this  list  were  selected  for  the  comparison  presented  in 
this  chapter,  since  the  majority  of  the  application  program’s  queries  will  probably  be  one 
of  these  three. 

Morris  (23)  presented  the  SQL/NF  descriptions  of  each  of  these  queries.  Accord¬ 
ing  to  his  analysis,  the  relational  representation  requires  eight  subqueries  to  extract  all 
drawing  data  for  a  particular  sheet,  whereas  only  one  query  is  needed  to  extract  the  same 
information  from  the  NRM  database.  In  the  same  fashion,  six  subqueries  are  required  in 
the  relational  version  as  compared  to  one  query  in  the  nested  version  to  extract  activity 
data  dictionary  information.  To  extract  data  element  data  dictionary  information,  twelve 
subqueries  are  required  in  the  relational  version  and  one  query  is  needed  in  the  nested 
version. 

4.2.3  Method  of  Comparison.  The  two  representations  were  compared  with  respect 
to  code  generation  time  and  query  execution  time.  Each  query  was  generated  and  executed 
twenty  times,  and  the  minimum,  maximum,  variance,  and  average  were  calculated  for  each 
of  these  twenty  runs.  All  reported  times  are  in  seconds,  and  all  runs  were  accomplished  on 
a  Sun  3  workstation  at  approximately  the  same  level  of  workload. 


4-2 


Table  4.1.  Code  Generation  Times  for  Relational  Drawing  Data  Query 


Subquery  # 

Min  Time 

Max  Time 

Variance 

Average  Time 

1 

3.06 

3.22 

0.002080 

3.092 

2 

2.36 

2.50 

0.000779 

2.390 

3 

3.00 

3.18 

0.002378 

3.021 

4 

3.00 

3.12 

0.001873 

3.021 

5 

3.00 

3.14 

0.000938 

3.017 

6 

2.36 

2.50 

0.000825 

2.384 

7 

3.00 

3.16 

0.001831 

3.029 

8 

4.28 

4.42 

0.001401 

4.307 

Total 

24.261 

Table  4.2.  Code  Generation  Times  for  Relational  Activity  Data  Dictionary  Query 


Sub  query  # 

Min  Time 

Max  Time 

Variance 

Average  Time 

1 

2.44 

2.58 

0.002521 

2.465 

2 

2.34 

2.48 

0.000922 

2.362 

3 

2.54 

2.88 

0.006496 

2.573 

4 

3.02 

3.16 

0.002152 

3.054 

5 

3.02 

3.14 

0.001301 

3.038 

6 

3.78 

3.94 

0.002294 

3.819 

Total 

17.311 

4.2.4  Comparison  of  Code  Generation  Times.  Table  4.1  gives  the  code  generation 
figures  for  the  relational  version  of  the  drawing  query.  These  are  the  figures  obtained  by 
running  codegen  to  generate  the  code  to  implement  the  drawing  query.  Note  that  eight 
subqueries  are  needed,  so  the  total  gives  the  average  total  query  generation  time  for  the 
entire  query.  In  the  same  fashion,  Tables  4.2  and  4.3  give  the  code  generation  figures  for  the 
relational  versions  of  the  activity  and  data  element  data  dictionary  queries,  respectively. 
For  the  nested  version,  since  no  subqueries  are  needed  to  extract  the  drawing  data,  activity 
data  dictionary  information  and  data  element  data  dictionary  information,  each  of  these 
three  retrievals  involve  only  one  query.  Table  4.4  gives  the  code  generation  figures  for  all 
three  queries.  The  low  variance  validates  the  fact  that  workload  is  about  the  same  across 


the  board  for  all  runs. 

Table  4.5  compares  the  code  generation  times  for  the  relational  version  against  the 


Table  4.3.  Code  Generation  Times  for  Relational  Data  Element  Data  Dictionary  Query 


Subquery  # 

Min  Time 

Max  Time 

Variance 

Average  Time 

1 

2.38 

2.54 

0.001906 

2.413 

2 

2.32 

2.54 

0.004762 

2.366 

3 

2.34 

2.50 

0.001057 

2.374 

4 

2.36 

2.58 

0.003667 

2.386 

5 

2.98 

3.10 

0.000733 

2.992 

6 

2.84 

3.00 

0.003225 

2.876 

7 

2.84 

2.98 

0.001806 

2.858 

8 

2.34 

2.46 

0.000711 

2.355 

9 

2.34 

2.46 

0.007368 

2.350 

2.94 

3.16 

0.002706 

3.033 

11 

2.92 

3.08 

0.002754 

2.968 

12 

3.02 

3.16 

0.000972 

3.034 

Total 

32.005 

Table  4.4.  Code  Generation  Times  for  Nested  Queries 


Query 

Min  Time 

Max  Time 

Variance 

Average  Time 

Drawing  Data 

7.40 

MAM 

0.006349 

7.475 

Activ  Data  Diet 

3.10 

3.26 

0.001704 

3.131 

Data  Elem  Data  Diet 

6.06 

6.22 

0.002748 

6.103 

nested  version  for  each  of  the  three  queries.  Note  that  for  each  query,  the  generation  time 
for  the  nested  version  takes  less  than  one  third  the  time  for  the  equivalent  query  in  the 
relational  version.  The  difference  in  code  generation  times  can  be  attributed  to  two  factors: 
one,  the  relational  version  requires  several  subqueries  to  extract  needed  information  and 
two,  the  increased  complexity  of  the  plan  trees  for  the  relational  versions  of  the  queries 
adversely  impacts  the  speed  of  the  code  generator.  Each  of  these  factors  is  examined,  in 
turn,  below. 

The  fact  that  there  are  many  subqueries  needed  in  the  relational  version  means 
that  many  functions  of  the  code  generator  must  be  duplicated  for  each  subquery,  such  as 
opening  the  output  file  to  hold  the  generated  query,  checking  to  see  if  the  main  program 
should  be  generated,  and,  if  it  is,  generation  of  the  main  program.  These  actions  may  seem 


Table  4.5.  Comparison  of  Code  Generation  Times 


Query 

Relational  Generation  Time 

Nested  Generation  Time 

Drawing  Data 

24.261 

7.475 

Activ  Data  Diet 

17.311 

3.131 

Data  Elem  Data  Diet 

32.005 

6.103 

trivial,  but  can  increase  the  run  time  of  the  code  generator  for  queries  made  up  of  many 
subqueries. 

Perhaps  the  most  significant  contribution  to  the  code  generation  time  is  the  com¬ 
plexity  of  the  query’s  plan  tree.  In  the  relational  version,  there  are  40  tables  that  make  up 
the  database,  as  compared  to  only  one  table  in  the  nested  version.  For  both  the  relational 
and  nested  versions,  each  table  involved  in  a  query  must  have  at  least  one  filescan  node  in 
the  plan  tree.  Thus,  queries  in  the  relational  version  are  represented  as  multi-node  plan 
trees,  with  filescans  for  each  table  in  the  query  and  join  nodes  connecting  them.  However, 
the  plan  trees  in  the  nested  version  of  the  queries  contain  only  one  filescan  node,  since 
all  information  is  selected  and  projected  from  one  table.  The  code  generator  sets  up  a 
separate  iterator  for  each  filescan  and  join  node  in  the  plan  tree.  Clearly,  for  multi-node 
plan  trees,  the  job  of  codegen  is  much  more  involved.  A  good  assumption  seems  to  be  that 
this  complexity  accounts  for  a  large  portion  of  the  higher  code  generation  times  for  the 
relational  version  as  compared  to  the  nested  version. 

The  fact  that  the  code  generator  always  checks  for  the  existence  of  sublist  pointers 
in  list  nodes  is  a  design  characteristic  that  hinders  the  relational  version,  since  all  sublist 
pointers  are  null.  However,  elimination  of  this  check  would  not  make  up  the  huge  difference 
in  code  generation  times. 

4-2.5  Comparison  of  Query  Execution  Times.  Table  4.6  presents  the  query  exection 
figures  for  the  relational  version  of  the  drawing  query.  Note  that  eight  subqueries  are 
needed,  so  the  total  gives  the  average  total  time  to  execute  the  entire  query.  In  the  same 
fashion,  Tables  4.7  and  4.8  present  the  query  execution  figures  for  the  relational  versions  of 
the  activity  and  data  element  data  dictionary  queries,  respectively.  Table  4.9  presents  the 


Table  4.6.  Execution  Times  for  Relational  Drawing  Data  Query 


Subquery  # 

Min  Time 

Max  Time 

Variance 

Average  Time 

1 

0.30 

0.003583 

2 

0.40 

0.004901 

0.172 

3 

1 

0.30 

0.003954 

4 

0.06 

0.18 

0.001536 

5 

0.04 

0.28 

0.003491 

0.098 

6 

0.04 

0.18 

0.002454 

0.083 

7 

0.04 

0.38 

0.006732 

0.165 

8 

0.04 

0.2u 

0.002173 

0.084 

Total 

0.660 

Table  4.7.  Execution  Times  for  Relational  Activity  Data  Dictionary  Query 


Subquery  # 

Min  Time 

Max  Time 

Variance 

Average  Time 

1 

0.10 

0.000500 

2 

|  | 

0.04 

0.000067 

3 

0.04 

0.14 

0.000854 

0.073 

4 

0.06 

0.54 

0.009663 

0.170 

5 

0.08 

0.32 

0.0C3074 

0.140 

6 

0.06 

0.22 

0.002754 

0.108 

Total 

0.702 

NRM  version  query  execution  figures  for  the  drawing  query,  activity  data  dictionary  query, 
and  the  data  element  data  dictionary  query.  Again,  the  low  variance  is  a  good  indication 
that  the  workload  was  the  same  across  the  board  for  all  runs. 

Table  4.10  compares  the  query  execution  times  for  the  relational  version  against  the 
nested  version  for  each  of  the  three  queries.  As  with  code  generation,  the  nested  query 
exection  time  is  less  than  the  relational  query  execution  time.  The  difference  in  query 
execution  time  can  be  attributed  to  two  factors:  one,  the  large  number  of  joins  in  the 
relational  version,  and  two,  increased  disk  access  time  due  to  a  lack  of  clustering  in  the 
relational  version. 

As  stated  earlier,  all  information  is  contained  in  one  table  in  the  nested  version, 
whereas  the  relational  version  breaks  the  IDEFo  language  data  into  40  normalized  tables. 
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Table  4.8.  Execution  Times  for  Relational  Data  Element  Data  Dictionary  Query 


Sub query  # 

Min  Time 

Max  Time 

Variance 

Average  Time 

1 

0.06 

0.20 

0.001915 

0.121 

2 

0.04 

0.22 

0.002504 

3 

0.04 

0.18 

0.001637 

4 

0.06 

0.26 

0.002580 

5 

0.04 

0.16 

0.001746 

0.081 

6 

0.06 

0.24 

0.002942 

0.095 

7 

0.06 

0.18 

0.001637 

0.095 

8 

0.04 

0.16 

0.001853 

0.080 

9 

0.04 

0.14 

0.001246 

0.084 

10 

0.08 

0.34 

0.003515 

0.119 

11 

0.12 

0.32 

0.002699 

0.194 

12 

0.14 

0.24 

0.000500 

0.145 

Total 

1.263 

Table  4.9.  Query  Execution  Times  for  Nested  Queries 


Query 

Min  Time 

Max  Time 

Variance 

Average  Time 

Drawing  Data 

0.48 

1.00 

0.024399 

0.551 

Activ  Data  Diet 

0.52 

0.86 

0.004037 

0.695 

Data  Elem  Data  Diet 

0.74 

0.92 

0.001367 

0.769 

Each  time  data  is  correlated  between  two  or  more  tables,  a  join  is  executed.  While  no 
joins  are  required  in  the  nested  version  (since  all  data  is  contained  in  one  table),  some 
of  the  relational  subqueries  require  as  many  as  four  joins.  As  stated  in  Chapter  3,  the 
join  method  implemented  is  a  nested  loops  join  method,  which  is  not  very  efficient.  For 
each  tuple  of  one  relation,  each  tuple  of  the  second  relation  is  checked  to  see  if  the  join 
criteria  are  met.  A  more  efficient  algorithm  such  as  a  sort-merge  join  or  a  hash  join  would 
probably  speed  up  the  execution  time  of  the  relational  version.  However,  even  with  the 
speedup  realized  with  a  more  efficient  join  algorithm,  the  execution  time  of  the  relational 
version  would  probably  still  exceed  that  of  the  nested  relational  version,  since  no  joins  are 
required  in  the  nested  relational  version.  The  difference  in  execution  times  is  not  as  great 
as  expected,  probably  due  to  the  limited  number  of  tuples  in  the  relational  tables.  As  the 
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Table  4.10.  Comparison  of  Query  Execution  Times 


Query 

Relational  Execution  Time 

Nested  Execution  Time 

Drawing  Data 

C.660 

0.551 

Activ  Data  Diet 

0.702 

0.695 

Data  Elem  Data  Diet 

1.263 

0.769 

size  of  the  database  grows,  the  execution  times  should  increasingly  favor  the  nested  version 
of  the  database. 

The  second  reason  for  the  faster  execution  of  the  nested  queries  is  that  disk  access 
times  are  longer  for  relational  queries  since  tables  are  not  automatically  clustered  on  disk. 
The  EXODUS  storage  manager  automatically  clusters  tuples  of  a  collection  on  disk  to 
decrease  access  time.  Of  course,  this  means  that  attributes  in  the  tuples  are  clustered, 
including  relation-valued  attributes.  Since  the  nested  version  has  only  one  overall  table 
(with  several  nested  tables),  all  data  in  the  nested  version  should  theoretically  be  clustered 
on  disk.  However,  in  the  relational  version,  no  explicit  clustering  is  specified.  The  EXODUS 
storage  manager  allows  “near”  hints  as  to  how  to  cluster  data,  but  none  are  specified  for 
the  relational  version.  If  these  are  used,  the  execution  time  for  the  relational  version  might 
realize  some  speedup. 

4.2.6  Summary  of  Comparison.  The  goal  of  this  comparison  was  to  demonstrate 
the  empirical  advantage  of  the  NRM  over  the  relational  model  for  the  storage  and  retrieval 
of  complex  data.  This  advantage  was  demonstrated  by  comparing  the  two  models  on  code 
generation  time  and  query  execution  time.  In  both  areas,  the  nested  version  outperformed 
the  relational  version,  particularly  in  code  generation  time.  However,  between  generation 
and  execution  time,  execution  time  is  most  important.  This  is  because  queries  will  be 
generated  once,  but  will  be  executed  many  times.  However,  the  performance  of  both 
activities  must  be  considered,  since  both  play  a  crucial  role  in  determining  the  processing 
speed  of  the  Triton  system. 
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4.3  Evaluation  of  the  EXODUS  Toolkit  in  this  Research  Effort 

Implementation  of  a  DBMS  is  not  a  trivial  undertaking,  but  the  use  of  extensible 
systems  has  eased  some  of  the  burden  of  this  task.  The  use  of  the  EXODUS  extensible 
toolkit  reduced  the  development  time  of  the  Triton  nested  relational  database  system. 
The  following  subsections  present  the  advantages  and  disadvantages  of  using  the  EXODUS 
toolkit  in  the  development  of  the  Triton  system  by  specifically  evaluating  the  following: 

•  collections 

•  classes 

•  persistence 

•  data  representation 

•  optimizer  generator 

•  storage  manager 

4.3.1  Collections.  The  Triton  system  is  built  on  the  nested  relational  data  model, 
allowing  relation- valued  attributes  in  relations.  The  nested  relational  data  model  is  mapped 
very  nicely  using  the  collection  generator  class  supplied  by  EXODUS  in  the  E  program¬ 
ming  language.  Relation-valued  attributes  are  represented  using  collections  of  collections. 
Unfortunately,  EXODUS  only  provides  the  capability  for  sequential  scanning  of  collections, 
making  access  via  a  search  key  slow  for  large  relations.  The  only  way  around  this  short¬ 
coming  is  to  build  indexes  on  every  frequently  accessed  or  sufficiently  large  relation  and 
nested  relation. 

The  E  programming  language  does  not  permit  subclasses  to  be  derived  from  the 
collection  class,  which  is  a  limitation.  Since  a  collection  is  an  unordered  group  of  objects, 
assume  we  want  to  derive  a  subclass  of  the  collection  class  called  ordered.collection ,  which 
is  an  ordered  group  of  objects.  E  does  not  allow  this  derivation  from  the  collection  class. 

A  good  performance  aspect  of  the  collection  class  is  that  items  in  a  collection  are 
clustered  on  disk,  including  subcollections.  This  automatic  clustering  reduces  access  time 
and  frees  the  implementor  of  having  to  worry  about  the  grouping  of  data  on  disk. 
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4-3.2  Classes.  The  use  of  classes  is  an  excellent  object-oriented  technique  for  hiding 
the  actual  representation  of  data  and  allowing  access  to  subobjects  through  predefined 
constructor  and  member  functions.  The  use  of  generator  classes  should  prove  very  useful 
in  the  development  of  special  indexing  techniques  for  the  Triton  system.  This  is  because  a 
generic  index  (such  as  B+-tree)  can  be  developed  using  a  generator  class  which  takes  the 
type  as  a  parameterized  input.  A  specific  index  is  then  instantiated  for  a  particular  type 
using  the  generator  class  specification. 

4-3.3  Persistence.  In  E,  persistent  objects  are  declared  explicitly  by  using  the  key¬ 
word  persistent  before  the  declaration  of  the  object.  However,  only  db  types  can  be 
declared  persistent,  since  the  use  of  a  db  type  specifies  that  the  object  is  a  disk  object 
as  opposed  to  a  main  memory  object.  Unfortunately,  every  subcomponent  of  a  dbclass 
must  be  a  db  type,  which  can  cause  some  problems  if  the  database  implementor  chooses  to 
change  a  main  memory  class  to  a  persistent  class;  not  only  does  the  class  have  to  be  changed 
to  a  dbclass,  but  all  subobjects  in  the  class  must  be  changed  to  their  db  counterparts. 

4.3.4  Data  Representation.  As  stated  in  Section  4.3.1,  nested  relations  are  conve¬ 
niently  mapped  using  the  E  collection  class  by  mapping  them  directly  into  the  underlying 
programming  language.  This  allows  the  relation  definitions  (the  *.e  and  *.h  files)  to  be 
compiled  into  object  files  using  the  E  compiler.  Compiled  queries  are  linked  with  the 
object  files  containing  the  relation  definitions  and  executed  to  perform  the  query.  The 
advantage  of  this  method  is  that  queries  are  executed  quickly,  since  they  are  in  machine 
code.  However,  the  long  compilation  and  linking  time  makes  this  method  prohibitive  for 
dealing  with  ad  hoc  queries. 

4.3.5  Optimizer  Generator.  At  the  present  time,  Triton’s  optimizer  component  has 
not  been  developed.  However,  the  intention  is  to  use  the  EXODUS  optimizer  generator  to 
generate  this  component.  The  EXODUS  optimizer  generator  takes  as  input  (1)  a  set  of 
operators,  (2)  a  set  of  methods  that  implement  the  operators,  (3)  transformation  rules  that 
describe  equivalence-preserving  transformations  of  query  trees,  and  (4)  implementation 
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rules  that  describe  how  to  replace  an  operator  with  a  specific  method.  Using  these  rules, 
a  specific  optimizer  is  generated  for  the  particular  application. 

We  chose  to  use  the  EXODUS  optimizer  generator  for  Triton  because  the  relational 
algebra  used  by  the  system  lends  itself  to  EXODUS’  rule-based  method.  This  modular 
approach  to  database  development  will  reduce  the  amount  of  code  required  for  implemen¬ 
tation  of  the  Triton  system.  The  only  unique  code  Triton’s  developers  will  need  to  write 
will  be  the  additional  functions  that  are  called  by  the  optimizer  when  implementing  a 
specific  operator  or  access  method.  For  example,  take  the  following  implementation  rule: 

project  (loops_join  (1,2))  by  loops.join  (1,2)  combine_ljp; 

This  means  that  if  there  is  a  loops.join  followed  by  a  projection,  it  can  become  a  special 
case  of  loops-join  that  will  perform  the  join  and  the  projection  in  the  same  step.  To  do 
this,  the  function  combineJjp  is  called  to  combine  the  projection  list  and  the  argument  list 
of  the  loops.join.  The  function  combineJjp  is  an  example  of  the  code  that  must  be  written 
by  the  Triton  system  developers.  However,  this  is  far  less  involved  than  developing  an 
optimizer  from  scratch.  With  the  use  of  accurate  cost  functions,  the  geneiated  optimizer 
should  perform  as  well  as  a  custom-built  one. 

j.3.6  Storage  Manager.  The  use  of  the  EXODUS  storage  manager  greatly  reduced 
the  development  time  of  the  Triton  system  by  freeing  the  developer  from  worrying  about 
Triton’s  storage  component.  The  storage  manager’s  procedural  interface  allows  access  to 
database  objects  without  having  to  know  how  they  are  actually  being  manipulated.  As 
an  example,  persistent  objects  are  automatically  mapped  to  permanent  storage  locations 
through  the  use  of  the  persistent  keyword.  Data  clustering  is  performed  by  using  the 
collection  class  and  is  transparent  to  the  developer.  Access  to  collections  of  objects  is  pro¬ 
vided  by  the  E  programming  language  via  built  in  procedure  calls  to  the  storage  manager, 
such  as  scan,  in  ...  new,  and  delete. 

4.3.7  Overall  Evaluation  of  the  EXODUS  Toolkit.  EXODUS  was  very  helpful  in  the 
development  of  the  Triton  system.  In  particular,  the  E  programming  language  elegantly 
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maps  nested  relations  using  the  collection  class.  While  EXODUS  only  supplies  sequential 
access  to  collections,  implementation  of  indices  should  be  simple  using  generator  classes. 
The  storage  manager  handles  persistence  and  automatically  clusters  nested  relations  on 
disk.  In  addition,  the  storage  manager’s  procedural  interface  made  interaction  with  the 
storage  manager  virtually  transparent.  Use  of  the  EXODUS  optimizer  generator  should 
greatly  reduce  the  time  it  takes  to  create  Triton’s  optimizer. 

4.4  Summary 

The  first  half  of  this  chapter  demonstrated  the  viability  of  the  nested  model  by 
compaiing  the  performance  of  the  nested  and  relational  representations  of  IDEFo  language 
data  based  on  code  generation  time  and  query  exection  time.  In  both  areas,  the  nested 
version  performed  faster.  The  second  half  of  this  chapter  discussed  the  advantages  and 
disadvantages  of  using  the  EXODUS  toolkit  in  the  developement  of  the  Triton  nested 
relational  database  system. 
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V.  Conclusions  and  Recommendations 


5.1  Ovemiew 

This  chapter  surrun  arizes  and  draws  conclusions  about  the  work  presented  in  this 
thesis.  Recommendations  for  further  work  as  it  pertains  to  the  Triton  nested  database 
system  is  also  discussed. 

5.2  Summary  of  Research 

The  goal  of  the  research  presented  in  this  thesis  was  to  develop  a  backend  to  the 
Triton  nested  database  system  to  manipulate  the  data  in  the  database  as  specified  by 
some  query.  The  goal  was  not  to  provide  a  plethora  of  access  methods,  but  to  give  Triton 
the  ability  to  process  queries  of  any  type  (except  for  data  restructuring  queries)  and  to  lay 
the  foundation  for  future  work  in  this  area. 

The  work  accomplished  in  this  research  effort  may  be  summarized  as  follows: 

1.  Design  and  implementation  of  the  operator  methods  of  the  Colby  algebra  to  handle 
multiple  levels  of  nesting,  including: 

•  The  Jilescan  method  that  implements  a  project  and/or  a  select 

•  The  loops.join  method  that  implements  a  join  as  well  as  a  project  and/or  select 

2.  Design  and  implementation  of  the  access  methods  to  modify  data  in  the  database, 
including: 

•  The  store-values  method  that  adds  data  to  a  relation  at  any  level  or  levels  of 
nesting 

•  The  modify  method  that  modifies  data  in  a  relation  at  any  level  or  levels  of 
nesting 

•  The  delete  method  that  deletes  data  from  a  relation  at  any  level  or  levels  of 
nesting 

•  The  crealejrel  method  that  adds  a  new  relation  to  the  database 
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•  The  drop-rel  method  that  deletes  a  relation  from  the  database 


3.  Design  and  implementation  of  the  code  generator,  codegen,  to  implement  the  operator 
and  access  methods 

4.  Testing  of  methods  and  code  generator 

5.  Comparison  of  the  performance  of  the  nested  relational  model  versus  its  normalized 
(INF)  version  using  the  operator  methods  and  code  generator: 

•  Development  of  an  E  representation  of  a  sample  set  of  relational  IDEFo  language 
data  (23) 

•  Development  of  an  E  representation  of  nested  relational  IDEFo  language  data  (23) 

•  Design  of  programs  to  load  data  into  both  representations  of  the  IDEFo  language 
data 

•  Creation  of  queries  for  both  representations  to  evaluate  performance  of  the  NRM 
against  the  traditional  relational  model 

•  Comparison  of  the  NRM  against  the  relational  model  based  on  code  generation 
time  and  query  execution  time 

6.  Discussion  of  the  advantages  and  disadvantages  of  using  the  EXODUS  toolkit  in  this 
research  effort 

5.3  Conclusions 

By  using  Triton  to  store  a  nested  and  relational  representation  of  CASE  data  and 
using  the  methods  implemented  in  this  research  to  query  that  data,  we  found  that  the 
nested  relational  model  outperforms  the  relational  model  in  terms  of  code  generation  time 
and  query  execution  time.  These  results  demonstrate  the  viability  of  the  nested  relational 
model  for  the  representation  of  complex  data.  In  addition,  the  EXODUS  toolkit  provided 
several  necessary  tools  that  greatly  reduced  the  development  time  of  the  Triton  system. 
However,  the  use  of  the  EXODUS  toolkit  did  constrain  this  development,  and  future  efforts 
must  attempt  to  overcome  these  constraints.  With  this  in  mind,  the  following  section 
recommends  possible  areas  for  improvement  in  the  Triton  system. 
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5.4  Recommendations 

As  stated  earlier,  the  purpose  for  research  presented  in  this  thesis  was  to  give  Triton 
the  ability  to  process  queries  of  any  type  (except  for  data  restructuring  queries)  and  to 
lay  the  foundation  for  future  work  in  this  area.  Actions  must  be  taken  to  bring  the  Triton 
query  processing  system  up  to  full  capability.  These  actions  include: 

•  Design  of  the  nest  and  unnest  operator  methods  to  implement  the  data  restructuring 
actions  of  the  Colby  algebra 

•  Generation  of  the  optimizer  to  properly  optimize  the  query  tree  and  call  appropriate 
methods 

Additional  actions  need  to  be  taken  to  give  Triton  more  than  just  a  cursory  set  of 
operator  and  access  methods.  This  includes: 

•  Design  of  additional  operator  methods,  such  as  merge-join,  or  index-join  to  more 
efficiently  handle  the  join  operator 

•  Development  of  indexing  techniques  and  access  methods  to  use  them: 

-  I  recommend  looking  at  Bertino  and  Kim’s  work  in  this  area  (3),  but  especially 
recommend  implementing  a  similar  indexing  strategy  as  the  one  used  by  the 
ANDA  system  (11) 

As  a  performance  issue,  sequential  access  to  the  system  catalogs  is  inefficient.  I 
suggest  either  establishing  indices  intc  the  catalogs,  or  implementing  schema  information 
as  persistent  structures  in  main  memory,  as  is  done  in  the  Ariel  system  (14).  This  would 
speed  up  query  execution  substantially,  especially  as  the  size  of  the  system  catalogs  grows. 

I  recommend  that  future  work  address  the  possibility  of  making  a  query  interpreter  in 
addition  to  the  code  generator.  This  would  make  testing  easier  on  application  developers. 
The  interpreter  would  have  to  be  linked  with  the  schema  definition  files  for  all  relations  in 
the  database.  If  such  an  effort  is  undertaken,  the  developer  should  consider  implementing 
the  methods  as  a  class  hierarchy  of  plan  objects  as  is  done  in  Ariel  (14). 
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Appendix  A.  Colby  Relational  Algebra 


The  use  of  the  nested  relational  model  requires  the  development  of  operators  to 
manipulate  nested  data.  Latha  S.  Colby  (9)  devised  a  recursive  algebra  for  nested  relations 
that  builds  on  the  traditional  relational  algebra  operators.  Her  operators  allow  the  retrieval 
of  information  from  any  level  of  nesting  in  a  relation  without  first  “flattening  out”  the 
relation  to  fit  ‘ha  standard  relational  model.  To  pe.mit  nesting,  Colby  employed  set 
operators  and  redefined  the  select  (cr),  project  (7r)  and  join  (M)  operates  of  the  relational 
model  and  introduced  two  new  operators,  nest  (u)  and  unnest  (g).  ror  the  purposes  of 
this  exposition,  I  assume  that  the  select,  project  and  join  operators  are  understood  as  they 
apply  to  non-nested  relations.1  Their  use  by  Colby  is  defined  below. 

A.l  Select  (cr). 

The  select  operator  is  used  to  extract  tuples  from  the  relation  lhat  satisfy  a  specilic 
selection  criteria.  Its  structure  is  defined  as 

cr  (relationcon<*, •*;(,„  ( select  list)) 

The  select  list  is  used  to  indentify  conditions  on  relation-valued  attributes  (RVAs).  The 
select  list  can  be  null  or  has  the  following  recursive  form: 

( IT/ A QQ’-iditioji  ( select  list)) 

For  example,  using  the  database  schema  provided  in  Figure  1.3,  a  query  to  find  all 
the  employees  who  are  older  than  35  that  have  male  children  is  written  as 

o  (employeee7np_a?c>35  (childrensej:=M)) 

which  yields  the  relation  shown  in  Figure  A.l. 

'For  a  definition  of  the  relational  algebra  operators  for  non-nested  relations,  see  (8). 
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Figure  A.2.  Project  Out  Employee  and  Children  Names 


A. 2  Project  (ir). 

The  project  operator  is  used  to  extract  specific  attributes  of  a  relation.  Its  structure 
is  defined  as 


it  (( project  list )  relation) 

where  a  project  list  indentifies  the  list  of  attributes  to  be  extracted  from  the  relation.  If 
the  project  list  contains  any  relation-valued  attributes,  those  attributes  may  have  their 
own  project  list.  For  example,  a  qu  ry  to  retrieve  all  employees’  names  and  their  childrens’ 
names  from  the  employee  relation  s  written  as 

7 r  ((emp_name,  children  (chiid_name))  employee) 

which  yields  the  relation  shown  in  Figure  A.2. 
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Figure  A. 3.  The  Student  Relation 


A. 3  Join  (M). 

The  join  operator  joins  a  relation  to  another  relation  or  to  any  relation  valued  at¬ 
tribute  (RVA)  of  a  relation.  Its  structure  is  as  follows: 

(relatioiii  ( join  path),  relation) 

The  condition  is  specified  in  the  9  portion  of  the  query.  If  the  join  path  is  null,  the  two 
relations  will  be  joined  at  the  highest  level.  However,  the  join  path  may  identify  the  level 
of  relationi  where  the  join  is  to  take  place.  The  join  path  is  defined  as: 

(RVA  of  relationi  ( join  path  of  RVA  of  relationi)) 

For  example,  suppose  there  is  a  separate  student  relation  as  shown  in  Figure  A.3  in 
addition  to  the  nested  employee  relation.  The  query  to  join  the  children  attribute  of  the 
employee  relation  to  the  student  relation  where  children  name  is  equal  to  the  student  name 
is 


Mchiid.namc=name  (employee  (children),  student) 
which  yields  the  relatioi  shown  in  Figure  A.4. 
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Figure  A.4.  The  Employee  Relation  Joined  (on  Children)  to  the  Student  Relation 
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Figure  A. 5.  The  Flat  Employee  Relation 


A. 4  Nest  (n). 

The  nest  operator  is  a  restructuring  operator  that  groups  certain  specified  attributes  of  a 
relation  into  a  nested  version  of  the  relation.  Its  structure  is 

v  attribute  list  — >  RVA  name  (relation) 

where  the  attribute  list  specifies  the  attributes  to  be  nested  under  the  title  of  the  RVA  name. 

For  example,  a  query  to  take  the  flattened  version  of  the  employee  relation  as  shown  in 
Figure  A. 5  and  nest  child-name,  child-age  and  sex  into  a  relation  valued  attribute  called  children 
would  be  written  as 


v  childmame,  child_age,  sex  — *•  children  (employee) 
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Figure  A.6.  The  Nested  Employee  Relation 


which  creates  the  nested  version  of  the  employee  relation  as  shown  in  Figure  A. 6. 

A. 5  Unnest  (p). 

The  unnest  operator  is  the  inverse  operation  of  the  nest  operator.  It  flattens  out  a  relation 
by  duplicating  the  atomic  attributes  of  a  tuple  for  each  occurance  of  the  relation  valued  attributes. 
Its  structure  is  as  follows: 

p  relation-valued  attribute  (relation) 

To  illustrate  its  operation,  a  query  to  take  the  nested  version  of  the  employee  relation  as 
shown  in  Figure  A.6  and  flatten  it  would  be  written  as: 

p  children  (employee) 

creating  the  unnested  or  flattened  version  of  the  employee  relation  as  shown  in  Figure  A. 5. 
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Appendix  B.  Definition  of  Data  Structures 


The  design  and  implementation  of  the  Triton  nested  database  system  utilizes  several 
data  structures  first  developed  by  Mankus  (21)  that  I  modified  for  my  use.  Because  these 
structures  play  such  a  vital  role  in  understanding  the  design  of  my  operator  and  access 
methods,  each  is  explained  in  detail. 

B.l  Plan  Node 

The  output  of  the  query  optimizer  is  a  pointer  to  the  plan  tree  that  structurally 
represents  the  user’s  request.  Figure  B.l  shows  the  structure  of  a  plan  node. 

The  fields  of  the  plan  node  are  used  as  follows: 

•  method  -  specifies  the  operator  or  access  method  to  be  used  (for  example  filescan, 
loops.join,  create .rel,  or  store-values,  to  name  a  few) 

•  argument. name  -  the  name  of  the  input  relation  (used  only  if  the  method  is 
FILESCAN) 

•  argument. rel type  -  the  type  of  the  input  relation  (used  only  if  the  method  is 
FILESCAN) 

•  argument.struct  jium  -  an  integer  field  that  records  the  greatest  structure  number  of 
the  temporary  relation  templates  that  hold  the  intermediate  data  for  answering  the 
query 

•  argument.pred  -  a  pointer  to  a  predicate  tree  of  pred  nodes  which  represents  the 
conditions  on  the  highest  level  of  the  nested  relation 

•  argument.list  -  a  linked  list  of  list  nodes  which  identify  attributes 

•  argument. aux.pred  -  an  additional  place  to  hang  a  condition  (used  only  in  the 
LOOPS.JOIN  method  when  a  selection  is  simultaneously  taking  place) 

•  argument. auxJist  -  an  additional  place  to  hang  a  linked  list  of  list  nodes  (used  only 
in  the  LOOPS. JOIN  method  when  a  projection  is  simultaneously  taking  place) 

•  input[0]  and  input[l]  -  pointers  to  plan  nodes 
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Figure  B.2.  A  Pred  Node 


B.2  Pred  Node 

The  pred  and  aux.pred  portions  of  the  plan  node  can  each  point  to  a  tree  of  pred 
nodes  that  represent  query  conditions.  Figure  B.2  shows  the  structure  of  &  pred  node. 

The  fields  of  the  pred  node  are  used  as  follows: 

•  oper  -  identifies  the  operator  which  can  be  AND,  OR,  NOT,  LES  (less  than),  GRT 
(greater  than),  LEQ  (less  than  or  equal  to),  GEQ  (greater  than  or  equal  to),  EQL 
(equal  to),  and  NEQ  (not  equal  to) 

•  constant_on_right  -  boolean  variable  to  identify  whether  the  value  stored  in  the  right 
operand  is  a  constant  or  a  variable 

•  ref_name  -  records  the  relation  name  that  the  attribute  identified  in  the  operand 
portion  pertains  to 
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•  uflag  -  a  union  flag  to  record  whether  the  value  stored  in  the  op.type  field  is  a 
character  pointer,  integer,  float  or  a  pointer  to  another  pred  node 

•  op -type  -  holds  either  a  character  pointer,  integer,  float,  or  a  pointer  to  another  pred 
node 

The  pred  node  has  a  left  and  right  operand,  which  are  inputs  to  the  operator.  All 
operators  are  binary  except  for  the  NOT  operator,  which  is  unary  and  only  uses  the  left 
operand.  If  the  operator  is  an  AND,  OR,  or  NOT,  the  op.type  holds  a  pointer  to  another 
pred  node;  otherwise,  it  will  hold  either  a  pointer  to  a  character,  an  integer  value,  or  a 
float  value.  The  left  operand  holds  the  name  of  the  attribute  for  the  condition,  while  the 
right  operand  holds  the  value  of  a  constant  or  may  hold  the  name  of  another  attribute  in 
the  relation.  If  the  latter  case  is  true,  the  value  of  the  constant.on.right  field  would  be 
false. 

To  demonstrate  the  structure  of  a  predicate  tree,  the  predicate 
emp_age>40  and  dept= “Marketing” 

is  structurally  represented  in  Figure  B.3.  In  this  figure,  two  pred  nodes  are  used  to  represent 
the  two  conditions,  while  a  third  pred  node  connects  these  conditions  with  an  AND.  Since 
the  two  attributes  emp.age  and  dept  are  being  compared  to  a  constant  value,  the  value  of 
constanLon.right  for  both  of  the  nodes  at  the  bottom  of  the  tree  is  true.  The  left  operand 
is  always  the  name  of  an  attribute,  thus  the  left  op.type  field  will  always  hold  a  character 
value,  which  is  the  name  of  the  attribute. 

D.3  List  Node 

The  list  and  aux.list  portions  of  the  plan  node  can  each  point  to  a  linked  list  of  list 
nodes  which  represent  attribute  lists  for  use  in  performing  projections.  Figure  B.4  shows 
the  structure  of  a  list  node. 

The  fields  of  the  list  node  are  used  as  follows: 

•  attr  -  points  to  an  attrdesc  node  which  holds  a  description  of  the  attribute  and  is 
defined  in  detail  below 
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Figure  B.3.  A  Predicate  Tree 
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Figure  B.4.  A  List  Node 

•  cond  -  points  to  a  predicate  tree  of  pred  nodes  (only  used  if  the  attribute  is  relation¬ 
valued  and  there  is  a  selection  condition  that  applies  to  the  attributes  of  the  relation¬ 
valued  attribute) 

•  sublist  -  points  to  a  sublist  of  list  nodes  (only  used  if  the  attribute  is  relation- valued 
and  only  some  of  the  attributes  of  the  relation-valued  attribute  are  being  projected 
out) 

•  next  -  points  to  the  next  list  node  in  the  attribute  list 

As  stated  above,  the  list  structure  represents  an  attribute  list.  Relation-valued  at¬ 
tributes  may  have  a  condition  (which  points  to  a  predicate  tree  of  pred  nodes)  and  a  sublist 
(which  points  to  the  attributes  of  the  relation-valued  attribute  that  are  to  be  projected 
out),  "f  the  attribute  is  relation-valued  and  the  sublist  is  null,  all  attributes  of  the  relation- 
vaiae  ->bute  are  to  be  projected  out.  Through  the  use  of  sublists,  projections  can  be 
represented  at  any  level  of  nesting  in  the  relation.  The  list  structure  can  be  used  to  record 
a  selection,  projection,  or  both  a  selection  and  a  projection. 
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The  attr  field  of  the  list  node  points  to  an  aitrdesc  node  which  holds  information 
about  the  attribute.  Figure  B.5  shows  the  structure  of  an  attrdesc  node.  j 

j 

The  fields  of  the  attrdesc  node  are  used  as  follows:  1 

'j 

•  name  -  the  name  of  the  attribute  i 

i 

A 

•  type  -  the  type  of  the  attribute  (CHAR  if  the  attribute  is  a  character  pointer,  INT  J 

if  the  attribute  is  an  integer,  FLOAT  if  the  attribute  is  a  float,  or  PREV .DEFINED  l 

if  the  attribute  is  relation-valued)  I 

•  size  -  the  number  of  characters  if  the  type  is  CHAR,  the  number  of  bytes  if  the  type  i 

is  INT  or  FLOAT  (if  the  type  is  PREV .DEFINED,  then  size  is  0)  ) 

•  value  -  holds  a  character  string  if  the  type  is  CHAR,  an  integer  if  the  type  is  INT,  : 

a  float  if  the  type  is  FLOAT  (used  only  to  specify  the  value  of  an  attribute  if  a  new 

J 

tuple  is  being  added  to  the  relation  or  if  the  value  of  the  attribute  is  being  changed)  i 

\ 

•  rvatype  -  holds  the  name  of  the  type  of  tuple  to  be  stored  in  the  relation-valued  } 

attribute  (only  used  if  type  is  PREV  .DEFINED)  | 


parenlrel  -  holds  the  name  of  the  relation  to  which  this  attribute  belongs 


*  project  -  an  integer  value  that  holds  the  temporary  relation  number  (used  when  a 
temporary  relation  template  is  made  to  hold  the  intermediate  results  of  a  projection 
or  a  join) 
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Appendix  C.  Data  for  the  Employees  Relation 


The  following  files  specify  the  schema  shown  in  Figure  C.l  for  the  employees  nested 
relation.  The  employees  relation  is  a  collection  of  tuples  of  type  emp.  Children  and  projects 
are  relation-valued  attributes  of  emp ,  and  toys  is  a  relation-valued  attribute  of  children. 
The  employees  relation  is  used  in  the  body  of  this  thesis  to  demonstrate  the  function  of 
the  operator  and  access  methods. 

The  schema  for  the  nested  employees  relation  is  contained  in  a  series  of  files.  There 
is  a  .e  and  a  .h  file  describing  the  emp,  child,  toy,  and  project  schemas.  There  is  also  a  .e 
and  .h  file  describing  the  overall  employees  schema.  The  ./i  file  contains  the  structure  of 
the  relation,  while  the  .e  file  contains  the  code  to  implement  the  constructor  and  member 
functions.  Separating  the  schemas  into  pairs  of  files  was  necessary  to  ensure  the  schemas 
are  not  multiply  defined  if  they  are  used  more  than  once  in  a  relation.  In  the  code  that 
follows,  only  the  .e  file  for  the  emp  relation  is  given.  In  the  interest  of  space  and  because 
the  implementation  of  the  constructor  and  member  functions  for  the  child,  toy,  and  project 
relations  are  nearly  identical,  child.e,  toy.e,  and  project.e  are  not  given  here. 
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Figure  C.l.  Schema  for  the  Employees  Nested  Relation 


/*  The  following  is  the  content  of  employee. h  */ 

#ifndef  EMPLOYEES.!! 

#ifndef  EMP.H 
#include  "emp.h" 

#endif 

dbclass  empRVA : collection [emp] ; 

#def ine  EMPLOYEES.!! 

#endif 
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/*  The  following  is  the  content  of  employee. e  */ 

#include  <stream.h> 

#include  <stdio.h> 

#ifndef  EMPLOYEES_H 
#include  "employees. h" 

#endif 

persistent  empRVA  employees; 

/*  The  following  is  the  content  of  emp.h  */ 

#ifndef  EMP.H 
#ifndef  CHILD.H 
#include  "child. h" 

#endif 

#ifndef  PROJECT.H 
#include  "project.h" 

#endif 

dbstruct  emp  { 
dbchar  name [32] ; 
dbint  age ; 
dbint  dno; 

dbclass  childRVA:collection[child]  ; 
childRVA  children; 

dbclass  projectRVA: collection  [project] ; 
projectRVA  projects; 
public: 

emp  (char  *,  int,  int); 

char  *  get .name (); 

void  change.name  (char  *) ; 

int  get.ageO; 

void  change.age  (int) ; 

int  get.dnoO; 

void  change.dno  (int) ; 

void  print  (emp  *) ; 

}; 

#def ine  EMP.H 
#endif 
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/*  The  following  is  the  content  of  emp.e  */ 

#include  <stream.h> 
ttinclude  <stdio.h> 

#ifndef  EHP_H 
#include  "erap.h" 

#endif 

emp::emp  (char  *  nameAtom,  int  ageAtom,  int  dnoAtom)  { 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 
age  =  ageAtom; 
dno  =  dnoAtom; 

> 

char  *  emp: :get_name()  { 
dbchar  *  source  =  name; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

} 

void  emp: :change_name  (char  *  nameAtom)  { 
dbchar  *  dost; 
dest  =  name; 

while  (-.■‘dest+t  ~  *nameAtom++) ; 

> 

int  emp: :get_age()  { 
return  (age) ; 

> 

void  emp : : change_ago  (int  ageAtom)  { 
age  =  ageAtom; 

> 

int  emp: :get_dno()  { 
return  (dno) ; 

> 
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void  emp: :change_dno  (int  dnoAtom)  { 
dno  =  dnoAtom; 

> 

void  emp::print  (emp  *  emp_ptr)  •( 

cout  «  .form  ("emp.name:  '/,s\n",  emp_ptr->get_name()) ; 
cout  <<  form  ("emp_age:  '/,d\n" ,  emp_ptr->get_age()) ; 
cout  «  form  ("emp_dno:  '/,d\nM,  emp_ptr->get_dno()) ; 
emp  &  einp_ref_ptr  =  *  emp.ptr; 

iterate  (child  *  child.ptr  =  emp.ref.ptr. children. scan()) 
child.ptr  ->  print  (child_ptr) ; 
iterate  (project  *  project_ptr  =  emp_ref_ptr. projects. scan()) 
project_ptr  ->  print  (project_ptr) ; 

> 

/*  The  following  is  the  content  of  child. h  */ 

#ifndef  CHILD.H 
#ifndef  TOY.H 
#include  "toy.h" 

#endif 

dbstruct  child  { 

dbclass  toyRVA : collection [toy] ; 
toyRVA  toys; 
dbint  age ; 
dbcnar  name [32] ; 
public ; 

child  (int,  char  *); 

int  get_age() ; 

void  change,. age  (int); 

char  *  get .name () ; 

void  change.name  (char  *) ; 

void  print  (child  *) ; 

x. 

> 
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#define  CKILD.H. 

#endif 

/*  The  following  is  the  content  of  toy.h  */ 

#ifndef  TOY.H 
dbstruct  toy  { 
dbchar  color  [32]; 
dbchar  name [32] ; 
public : 

toy  (char  *,  char  *) ; 

char  *  get_color(); 

void  change.color  (char  *) ; 

char  *  get.nameO; 

void  change.name  (char  *) ; 

void  print  (toy  *) ; 

>; 


/*  The  following  is  the  content  of  project. h  */ 

tifndef  PROJECT.H 
dbstruct  project  { 
dbchar  name [32] ; 
dbint  number; 
public: 

project  (char  *,  int) ; 

char  *  get.nameO; 

void  change.name  (char  *) ; 

int  get.numberO ; 

void  change.number  (int) ; 

void  print  (project  *) ; 

>; 


tdefine  PROJECT.H 
# end if 
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Appendix  D.  Data  for  the  Products  Relation 


The  following  files  specify  the  schema  shown  in  Figure  D.l  for  the  products  nested 
relation.  The  products  relation  is  a  collection  of  tuples  of  type  product.  Manufacturers  is 
a  relation- valued  attribute  of  product.  The  products  relation  is  used  in  the  body  of  this 
thesis  to  demonstrate  the  function  of  the  loops.join  operator  method. 

The  schema  for  the  nested  products  relation  is  contained  in  a  series  of  files.  There 
is  an  .e  and  an  .h  file  describing  the  product  and  manufacturer  schemas.  There  is  also  an 
.e  and  .h  file  describing  the  overall  products  schema.  The  .h  file  contains  the  structure  of 
the  relation,  while  the  .e  file  contains  the  code  to  implement  the  constructor  and  member 
functions.  Separating  the  schemas  into  pairs  of  files  was  necessary  to  ensure  the  schemas 
are  not  multiply  defined  if  they  are  used  more  than  once  in  a  relation.  In  the  code  that 
follows,  only  the  .e  file  for  the  product  relation  is  given.  In  the  interest  of  space  and  because 
the  implementation  of  the  constructor  and  member  functions  for  manufacturer  are  nearly 
identical,  the  content  of  manufacturer. e  is  not  given  here. 


name 

price 

manufacturers 
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name  phone  j 

Figure  D.l.  Schema  for  the  Products  Nested  Relation 


/*  The  following  is  the  content  of  products. h  */ 

tfifndef  PRODUCTS.H 
#ifndef  PRODUCT.H 
#include  "product. h" 
ttendif 

dbclass  productRVA:collection[product] ; 

#def ine  PRODUCTS.H 
#endif 
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/*  The  following  is  the  content  of  products.e  */ 

#include  <stream.h> 
ttinclude  <stdio.h> 

#ifndef  PRODUCTS_H 
#include  "products. h" 

#endif 

persistent  productRVA  products; 

/*  The  following  is  the  content  of  product. h  */ 

tifndef  PRODUCT.H 
#ifndef  MANUFACTURER.H 
#include  "manufacturer.h" 

#endif 

dbstruct  product  { 
dbchar  name [32] ; 
dbfloat  price; 

dbclass  manufacturerRVA: collection [manufacturer] ; 
manufacturerRVA  manufacturers; 
public: 

product  (char  *,  float); 

char  *  get.nameO; 

void  change_name  (char  *) ; 

float  get_price(); 

void  change_price  (float) ; 

void  print  (product  *) ; 

>; 


#def ine  PRODUCT.H 
#endif 


/*  The  following  is  the  content  of  product. e  */ 

#include  <stream.h> 

#include  <stdio.h> 

ttifndef  PRODUCT.H 
#include  "product. h" 

#endif 

product: : product  (char  *  nameAtom,  float  priceAtom)  { 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 
price  =  priceAtom; 

> 

char  *  product: : get _name()  { 
dbchar  *  source  =  name; 
char  *  dest  =  new  char [32]; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 

void  product :: change_name  (char  *  nameAtom)  { 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 

} 

float  product:  :get_price()  •{ 
return  (price) ; 

} 

void  product: :change_price  (float  priceAtom)  { 
price  =  priceAtom; 

> 
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void  product: : print  (product  *  product_ptr)  { 

cout  <<  form  ("product .name:  */,s\n",  product_ptr->get_name()) ; 
cout  <<  form  ("product.price:  '/,f\n",  product_ptr->get_price()) ; 
product  &  product_ref_ptr  =  *  product_ptr; 
iterate  (manufacturer  *  manufacturer_ptr  =  product_ref_ptr . 

manufacturers .scan()) 

manufacturer.ptr  ->  print  (manuf acturer.ptr) ; 

> 

/*  The  following  is  the  content  of  manufacturer. h  */ 

#ifndef  MANUFACTURERS 
dbstruct  manufacturer  { 
dbchar  location [32] ; 
dbcliar  name  [32]  ; 
dbint  phone; 
public: 

manufacturer  (char  *,  char  *,  int) ; 

char  *  get_location() ; 

void  change_location  (char  *) ; 

char  *  get_name() ; 

void  change_name  (char  *) ; 

int  get_phone(); 

void  change_phone  (int) ; 

void  print  (manufacturer  *) ; 

>; 


#def ine  MANUFACTURERS 
#endif 
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Appendix  E.  Filescan  Method 


The  following  code  implements  the  filescan  query  depicted  in  Figure  3.6  and  described 
in  Section  3.3.1.  In  the  Triton  system,  the  code  in  this  appendix  would  be  written  to  a  file 
called  query. e  by  the  E  code  generator  codegen  to  implement  the  query.  The  comments 
that  appear  in  the  code  in  this  appendix  are  not  generated  by  codegen  but  have  been  put 
in  by  the  author  to  explain  the  operation  of  the  filescan. 


#include  <stream.h> 

#include  <stdio.h> 

#ifndef  EMPLOYEES.H 
#include  11  employees,  h" 

# end if 

extern  persistent  erapEVA  employees; 

/$>i:*>t:$;|<**$***>i<**%*$******:4<***’fc*****’i‘’fc’l<*’l<****$>l<******$>l'***/ 

/*  Temporary  relation  template  —  holds  the  projected  */ 
/*  attributes  of  the  toys  relation.  */ 

dbstruct  tempi  { 
dbchar  name [32] ; 
public : 

tempi  (char  *) ; 
char  *  get_name(); 

>5 


/*  Constructor  function  code  for  tempi  */ 

tempi: stempl  (char  *  nameAtom)  { 
dbchar  *  dest; 
desl  -  name; 

while  (*dest++  =  *nameAtom++) ; 

>' 
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/*  Code  for  implementing  member  function  of  tempi  */ 

char  *  tempi: : get .name ()  { 
dbchar  *  source  =  name; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 

/*  Temporary  relation  template  —  holds  the  projected  */ 
/*  attributes  of  the  children  relation.  */ 

dbstruct  temp2  { 
dbchar  name [32] ; 
dbint  age; 

dbclass  tempiRVA: collection [tempi]  ; 
templRVA  templ.rels; 
public: 

temp2  (char  *,  int); 
char  *  get.nameO; 
int  get.ageO; 

>; 


/*  Constructor  function  code  for  temp2  */ 

temp2::temp2  (char  *  nameAtom,  int  ageAtom)  { 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 
age  =  ageAtom; 

} 
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/*  Code  for  implementing  member  functions  of  temp2  */ 

char  *  temp2: :get_name()  { 
dbchar  *  source  =  name; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 

int  temp2: :get_age()  { 
return  (age) ; 

> 

/*  Temporary  relation  template  —  holds  the  projected  */ 
/*  attributes  of  the  employees  relation.  */ 

dbstruct  temp3  { 
dbchar  name  [32] ; 
dbint  age ; 

dbclass  temp2RVA: collection [temp2] ; 
temp2RVA  temp2_rels; 
public: 

temp3  (char  *,  int); 
char  *  get_name(); 
int  get_age() ; 

}; 


/*  Constructor  function  code  for  temp3  */ 

temp3::temp3  (char  *  nameAtom,  int  ageAtom)  { 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 

n  M  M  —  H  (an  A  MW  • 

> 
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/*  Code  for  implementing  member  functions  of  temp3  */ 

char  *  temp3: :get_name()  { 
dbchar  *  source  =  name; 
char  *  dest  =  new  char  [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 

int  temp3: :get_age()  { 
return  (age) ; 

> 


/*  FII.ESCAN  ITERATOR  */ 
/*  */ 
/*  This  iterator  will  perform  the  filescan  of  the  employees  */ 
/*  relation  and  will  yield  a  pointer  to  the  resultant  projected  */ 
/*  tuple,  (i.e.  a  pointer  to  a  tuple  of  type  temp3)  */ 


iterator  temp3  *  filescan_temp3() 

/*  scan  the  employee  relation  */ 

iterate  (emp  *  left_tuple_ptrl  =  employees.scanQ) 

■C 

/*  selection  condition  */ 

if  (left_tuple_ptrl->get_age()  >  30) 

{ 

/*  copy  the  projected  atomic  attributes  from  employees  */ 
/*  into  temp3  */ 

temp3  *  t3  =  new  temp3  (left_tuple_ptrl->get_name() , 

left_tuple_ptrl->get_age()) ; 
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/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 
/*  nesting  in  temp3  */ 

j  *******  *  +  ***  +  :♦:  ***  +  :(:  if:*******:*****************************  / 

temp3  &  temp3_ref  ~  *  t3; 

/  ********************************************************/ 
/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 
/*  nesting  in  employees  */ 

/***********;{  ********************************************/ 

emp  &  left_source_ref i  =  *  left_tuple_ptrl; 

I  ********************************  / 

/*  scan  the  children  relation  */ 
/********************************/ 
iterate  (child  *  left_tuple_ptr2  =  left_source_ref 1. 

children.scanO) 

{ 

/  *  *  *  s*c  ft  lit  *  ijt  *  i(t  if:  *  1ft  *  if;  *  *  if;  i(t  if;  lit  *  lie  *  *  / 

/*  selection  condition  */ 
if  (left_tuple_ptr2->get_age()  <  5) 

■c 

/*  copy  the  projected  atomic  attributes  from  */ 
/*  children  into  temp2  */ 

temp2  *  t2  =  in  (temp3_ref .temp2_rels)  new  temp2 
(left_tuple_ptr2->get_name() , 
left_tuple_ptr2->get_age()) ; 

/*  set  up  a  reference  pointer  to  descend  one  */ 

/*  level  of  nesting  in  temp2  */ 

temp2  &  temp2_ref  =  *  t2; 

/$$$$$$$$$$$$$  $$$$$$$£$£$$  f 

/*  set  up  a  reference  pointer  to  descend  one  */ 

/*  level  of  nesting  in  children  */ 

child  &  left_source_ref2  =  *  left_tuple_ptr2; 
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/****#**************#********/ 

/*  scan  the  toys  relation  */ 
/****************************/ 
iterate  (toy  *  left_tuple_ptr3  =  left_source_ref2. 

toys  .scanQ) 

{ 

/*  copy  the  projected  atomic  attributes  */ 

/*  from  toys  into  tempi  */ 

/******************************************/ 
tempi  *  ti  =  in  (temp2_ref .tempi_rels)  new  tempi 
(left_tuple_ptr3->get_name()) ; 

> 


} 

> 

/*  yield  the  projected  tuple,  which  is  a  pointer  to  a  temp3  */ 

/fr***********#***********#****#****#*************************/ 

yield  (t3) ; 

> 

> 

> 

/*  MAIN  PROGRAM  */ 

/*  The  main  program  will  call  the  filescan  iterator  and  print  out  the  */ 
/*  contents  of  the  projected  and  selected  relation.  */ 

/************************************** **********************************/ 
mainO 
{ 

iterate  (temp3  *  left_tuple_ptrl  =  f ilescan_temp3()) 

{ 

cout  <<  form  ("employees.name:  '/is\n", 

left_tuple_ptrl  ->  get_name()); 
cout  <<  form  ("employees. age:  '/.dNn", 

left_tuple_ptrl  ->  get_age()); 
temp3  &  left_source_ref 1  =  *  left_tuple_ptrl; 
iterate  (temp2  *  left_tuple_ptr2  =  left_source_ref i . 

temp2_rels . scan() ) 

•C 

cout  <<  form  ("children.narae:  '/,s\n", 

left_tuple_ptr2  ->  get.nameO); 
cout  <<  form  ("children.age:  '/,d\n" , 

left_tuple_ptr2  ->  get.ageO); 
temp2  &  left_source_ref2  =  *  left_tuple_ptr2; 
iterate  (tempi  *  left_tuple_ptr3  =  left_source_ref2. 
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cout  «  form  ("toys_name:  ’/.sXn", 

left_tuplG_ptr3  ->  go 


> 
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Appendix  F.  Loops  Join  Method 


The  following  code  implements  the  loops.join  query  depicted  in  Figure  3.7  and  de¬ 
scribed  in  Section  3.3.2.  In  the  Triton  system,  the  code  in  this  appendix  would  be  written 
to  a  file  called  query. e  by  the  E  code  generator  codegen  to  implement  the  query.  The 
comments  that  appear  in  the  code  in  this  appendix  are  not  generated  by  codegen  but  have 
been  put  in  by  the  author  to  explain  the  operation  of  the  loops.join. 


#include  <stream.h> 

#include  <stdio.h> 

#ifndef  EMPLOYEES.!! 

#include  11  employees,  h" 

#endif 

extern  persistent  empRVA  employees; 

/*  Temporary  relation  template  —  holds  the  projected  */ 
/*  attributes  of  the  toys  relation.  */ 

dbstruct  tempi  { 
dbchar  name  T32] ; 
public : 

tempi  (char  *) ; 
char  *  get.nameO; 

>; 


/*  Constructor  function  code  for  tempi  */ 


tempi:: tempi  (char  *  name Atom)  { 
dbchar  *  dest; 


docf  =  r>amo  • 

while  (*dest++  =  *nameAtom++) ; 

> 


/*  Code  for  implementing  member  function  of  tempi  */ 

char  *  tempi: :get_name()  { 
dbchar  *  source  =  name; 
char  *  dest  =  new  char [32]; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

} 

/*  Temporary  relation  template  —  holds  the  projected  */ 
/*  attributes  of  the  employees  relation.  */ 

dbstruct  temp2  { 
dbchar  name [32] ; 

dbclass  tempiRVA:collection[tempi] ; 
tempiRVA  templ_rels; 
public: 

temp2  (char  *) ; 
char  *  get_name(); 

}» 


/*  Constructor  function  code  for  temp2  */ 

temp2::temp2  (char  *  nameAtom)  { 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 

> 
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/*  Code  for  implementing  member  function  of  temp2  */ 

/***********•+*****>)(*:+:>)(*******  ************************/ 

char  *  temp2: :get_name()  { 
dbchar  *  source  =  name; 
char  *  dest  =  new  char [32]; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

} 

/*** **** **** *** ** **** * ******* ************* ****************** ******* / 


/*  FILESCAN  ITERATOR  */ 
/*  */ 
/*  This  iterator  will  perform  the  filescan  of  the  employees  */ 
/*  relation  and  will  yield  a  pointer  to  the  resultant  projected  */ 
/*  tuple,  (i.e.  a  pointer  to  a  tuple  of  type  temp2)  */ 


iterator  temp2  *  f ilescan_temp2() 

{ 

/*  scan  the  employee  relation  */ 

iterate  (emp  *  left_tuple_ptri  =  employees. scanO) 

{ 

/*  copy  the  projected  atomic  attribute  from  employees  */ 

/*  into  temp2  */ 

temp2  *  t2  =  new  temp2  (left_tuple_ptrl->get_name()) ; 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 

/*  nesting  in  temp2  */ 

temp2  &  temp2_ref  =  *  t2; 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 

/*  nesting  in  employees  */ 

ern.  &  left_source_ref 1  =  *  lef t_tuple_ptrl ; 
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/*  scan  the  children  relation  */ 

iterate  (child  *  left_tuple_ptr2  =  left_source_ref i. children. scanO) 

{ 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 

/*  nesting  in  children  */ 

child  &  left_source_ref2  =  *  left_tuple_ptr2; 

/*  scan  the  toys  relation  */ 

iterate  (toy  *  left.tuple.ptr3  =  left_source_ref2. toys. scanO) 

/*  copy  the  projected  atomic  attribute  from  toys  */ 

/*  into  tempi  */ 

(c****  Jit****  *******  *******  ******  *******  *****>!<*  *****/ 

tempi  *  ti  =  in  (temp2_ref .tempi_rels)  new  tempi 
(left_tuple_ptr3-^  me  O') ; 

> 

> 

/*****  ********************************  *************************/ 

/*  yield  the  projected  tuple,  which  is  a  pointer  to  a  temp2  */ 
/**************************************************************/ 
yield  (t2) ; 

> 

> 

tifndef  PRODUCTS.!! 

#include  "products.h" 

#endif 

extern  persistent  productRVA  products; 
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/***********:*********************************************/ 
/*  Temporary  relation  template  —  holds  the  projected  */ 
/*  attributes  of  the  products  relation.  */ 

dbstruct  temp3  { 
dbchar  name [32] ; 

dbclass  manufacturerRVA: collection  [manufacturer]  ; 
manufacturerRVA  manufacturers; 
public : 

temp3  (char  *) ; 
char  *  get_name(); 

}; 


/#  Constructor  function  code  for  temp3  */ 

temp3::temp3  (char  *  nameAtom)  •[ 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 

} 

/**********!******************************************  J 
/*  Code  for  implementing  member  function  of  temp3  */ 
j  ***********************************  *****************  / 

char  *  temp3:  :get_name()  •{ 
dbchar  *  source  =  name; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 
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/#  Jit##***#**##********#********************************##***********/ 


/*  FILESCAN  ITERATOR  */ 
/*  */ 
/*  This  iterator  will  perform  the  filescan  of  the  products  */ 
/*  relation  and  will  yield  a  pointer  to  the  resultant  projected  */ 
/*  tuple,  (i.e.  a  pointer  to  a  tuple  of  type  temp3)  */ 


***$*******  ft***#*  **’1' ft***  ft***  *&*$*>!•’*  ft*** ’I'**** ’I'**  ft**** ’k  * ’I’**  / 

iterator  temp3  *  filescan_temp3() 

{ 

/*  scan  the  products  relation  */ 

iterate  (product  *  left_tuple_ptri  =  products. scanO) 

{ 

jit****  jit*******  ************  IK********  jit#***  ****  ****  **  ** *****/ 

/*  copy  the  projected  atomic  attributes  from  products  */ 

/*  into  temp3  */ 

** * $  s)c  **  *  **  $  *  *  if.  *  ste  *  $  $  *  *  *  *  *  #  *  #  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  ★  *  *  *  *  *  *  *  / 

temp3  *  t3  =  new  temp3  (left_tuple_ptrl->get_name()) ; 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 

/*  nesting  in  temp3  */ 

temp3  &  temp3_ref  =  *  t3; 

/fr*******************************************************/ 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 

/*  nesting  in  products  */ 

product  &  left_source_ref 1  =  *  left_tuple_ptrl ; 

/*  scan  the  manufacturers  relation  */ 

iterate  (manufacturer  *  left_tuple_ptr2  =  left_source_refl. 

manufacturers .  scanO  ) 

{ 

/************************************************************/ 
/*  copy  the  projected  atomic  attributes  from  manufacturers  */ 
/*  into  the  manufacturer  relation- valued  attribute  of  temp3  */ 

manufacturer  *  manuf acturer_ptr  =  in  (temp3_ref .manufacturers) 
new  manufacturer  (left_tuple_ptr2->get_location() , 
left_tuple_ptr2->get_name() , 
left_tuple_ptr2->get_phone()) ; 

} 
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/*  yield  the  projected  tuple,  which  is  a  pointer  to  a  temp3  */ 

yield  (t3) ; 

} 

> 

/*  Temporary  relation  template  —  holds  the  attributes  */ 

/*  of  the  join  for  the  manufacturers  relation- valued  */ 

/*  attribute  of  the  join  */ 

dbstruct  temp4  { 

dbchar  location [32] ; 
dbchar  name [32] ; 
dbint  phone ; 
public : 

temp4  (char  *,  char  *,  int); 
char  *  get„location() ; 
char  *  get_name(); 
int  get_phone() ; 

>; 


/at'**  ft**  ft*****  ft**#**  ft****  ******  ft?*****/  * 

/*  Constructor  function  code  for  temp4  */ 

/ ***************************************** / 

temp4::temp4  (char  *  locationAtom,  char  *  nameAtom,  int  phoneAtom)  { 
dbchar  *  dest; 
dest  =  location; 

while  (*dest++  =  *locationAtom++) ; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 
phone  =  phoneAtom; 

} 
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/*  Code  for  implementing  member  functions  of  temp4  */ 

char  *  temp4: :get_location()  { 
dbchar  *  source  =  location; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 

char  *  temp4: : get .name ()  { 
dbchar  *  source  *  name; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

} 

int  temp4: :get_phone()  { 
return  (phone) ; 

} 

/*  Temporary  relation  template  —  holds  the  attributes  */ 

/*  of  the  join  at  the  join_level  */ 

dbstruct  temp5  { 

dbchar  toyslname[32] ; 
dbchar  products .name [32] ; 
dbclass  temp4RVA : collection [temp4] ; 
temp4RVA  temp4_rels; 

public : 

temp5  (char  *,  char  *); 
char  *  get_toys_name() ; 
char  *  get_products_name() ; 

>; 
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/*  Constructor  function  code  for  temp5  */ 

terap5::temp5  (char  *  toys.nameAtom,  char  *  products_nameAtom)  { 
dbchar  *  dest; 
dest  =  toys_name; 

while  (*dest++  =  *toys_nameAtom++) ; 

dest  =  products.name; 

while  (*dest++  =  *products_nameAtom++) ; 

> 

/*  Code  for  implementing  member  functions  of  temp5  */ 

char  *  temp5:  :get_toys_name()  ■£ 
dbchar  *  s^uct  -  toys_name; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  -  *source++) ; 
return  (start) ; 

} 

char  *  temp5: :get_products_name()  { 
dbchar  *  source  =  product s_name; 
char  *  dest  =  new  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 

/*  Temporary  relation  template  —  holds  the  attributes  */ 

/*  of  the  join  for  the  employees  attribute  of  the  join  */ 

dbstruct  temp6  { 
dbchar  name [32] ; 

dbclass  temp5RVA: collection [temp5] ; 
temp5RVA  temp5_rels; 
public : 

temp6  (char  *) ; 
char  *  get_name(); 

>; 
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/*  Constructor  function  code  for  temp6  */ 

temp6::temp6  (char  *  nameAtom)  { 
dbchar  *  dest; 
dest  =  name; 

while  (*dest++  =  *nameAtom++) ; 

> 

/  *  *  *  *  *  *  *  * * *  *  £  *  *  *  *  *  *  *  *  *  j(c  *  *  *  *  *  *  4!  J(t  #  *  *  %  3|e  *  *  *  j(e  **  ft  $  *  *  %  *  **  *  *  $  / 

/*  Code  for  implementing  member  function  of  temp6  */ 

/  *  %  *  *  s|c  $  *  *  *  *  *  s|e  *  *  !(<  *  *  *  *  *  £  *  +  +  *  *>((*!)<  *  Jit*#*  *  **  %  %  *  *  %  *  *  *  *  *  He  **  / 


char  *  temp6: :get_name()  { 
dbchar  *  source  =  name; 
char  *  dest  =  now  char [32] ; 
char  *  start  =  dest; 
while  (*dest++  =  *source++) ; 
return  (start) ; 

> 


/******************** *********** ******* *************** ********* ****/ 


/*  L00PS_J0IN  ITERATOR  */ 
/*  */ 
/*  This  iterator  will  perform  the  join  of  the  products  relation  */ 
/*  to  the  employees  relation  and  will  yield  a  pointer  to  the  */ 
/*  resultant  tuple,  (i.e.  a  pointer  to  a  tuple  of  type  tempd)  */ 


/******************************************************************/ 


iterator  temp6  *  loops. join_temp6() 

{ 

/*****************************************************/ 
/*  scan  the  temp2  relation  using  filescan  iterator  */ 

iterate  (temp2  *  left_tuple_ptrl  =  f ilescan_temp2()) 


I  Sfcjfc* ********************** *♦*#**♦** **************** *******/ 

/*  copy  the  atomic  attributes  from  left  relation  into  */ 
/*  temp6  */ 
/  *************  ^fc***************************************5^*2!5/ 


temp6  *  t6  =  new  temp6  (left_tuple_ptrl->get_name()) ; 
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/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 
/*  nesting  in  temp6  */ 


temp6  &  terap6_ref  =  *  t6; 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 
/*  nesting  in  the  left  relation  */ 

j  ****  *  *  *  *  *  *  *  **  *  **  *  ********************  *  **  ********>>  *******/ 
temp2  &  left_source_.ref  1  «  *  lef t^tuple^ptrl ; 

/*  scan  the  tempi  relation  */ 

iterate  (tempi  *  left_tuple_ptr2  =  left_source_ref 1. 

templ.rels .  scanO ) 

< 

/  *****************************************************  / 
/*  scan  the  temp3  relation  using  filescan  iterator  */ 

/  *  *  *  %  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  $  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  sfc  *  3te  *  *  *  / 


iterate  (temp3  *  right.tuple.ptrl  =  f ilescan_temp3()) 

{ 

/*  join  condition  */ 

if  (strcmp  (left„tuple_ptr2->get_name() ,  right_tuple_ptrl-> 

get_name())  ==  0) 


{ 

/*  copy  the  atomic  attributes  from  left  and  right  */ 
/*  relations  into  temp5  */ 

J  )|c  ^  ^  ^  ^  ^  ^  »|^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  s|/  ^  ^  ^  ^  ^  ^  ^  ^  ^  J 

temp5  *  t5  =  in  (temp6_ref .temp5_rels)  new  temp5 
(left_tuple_ptr2->get_name() , 
right_tuple_ptri->get_name()) ; 

/  *  $  *  >|c  *  He  *  *  >|:  *  *  *  *  *  *  %  *  *  *  *  *  *  *  *  *  *  *  #  *  *  *  *  *  *  *  *  *  *  *  sic  *  *  *  *  #  *  *  / 

/*  set  up  a  reference  pointer  to  descend  one  */ 

/*  level  of  nesting  in  temp5  */ 


temp5  &  temp5_ref  =  *  t5; 
/***********************************************/ 
/&  set  up  a  reference  pointer  to  descend  one  £/ 
/*  level  of  nesting  in  right  relation  */ 

/***********************************************/ 


temp3  &  right_source_ref 1  =  *  right  .tuple_ptri; 
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/***#  %****)►  #**  *******************  *  **)►;**  sic**/ 

/*  scan  temp3’s  manufacturers  relation  */ 

/%*  *  *  **  )(<  *  *  $  *  $  *  $  %  *  %  *  *  *  **  *  *  **  *  *  *  *  *  *  3(c  *  *  *  sfe  *!,'*>  *  / 

iterate  (manufacturer  *  right _tuple_ptr2  - 

right_source_ref 1  .manuf  acturers .  r.  ;an( ' ) 

{ 

/if******************************************/ 

/*  copy  the  atomic  attributes  from  right  */ 

/*  relation  into  temp4  */ 

/iie*********  ********)!(************************/ 

temp4  *  t4  =  in  (temp5_ref .temp4_rels)  new  temp4 
(right_tuple_ptr2->g«jt_location() , 
right_tuple_ptr2->get_name() , 
right_tupie_ptr2->get_phune()) ; 


/  j(!  *  *  *  *  *  9jc  £  *  >|<  *  *  *  J(C  *  *  *  $  *  *  >|c  3fe  *  *  $  ft  9|e  sfe  9(e  *  ft  ft  ft  )(c  a)c  +  af:  ft  ft  ft  ft  ft  *  *  ft  *  ft  9fc  ft  ft  %  $  * ft ftftft  / 

/*  yield  the  joined  tuple,  which  is  a  pointer  tc  a  temp6  */ 

yield  (t6) ; 

> 


/**********************************  *****************  ***********  ft*ftftft*/ 

/*  MAIN  PROGRAM  */ 

/*  */ 

/*  The  main  program  will  call  the  join  iterator  and  print  out  the  */ 

/*  contents  of  the  joined  relation.  */ 

/********************************************************************/ 
mainO 
{ 

iterate  (temp6  *  left_tuple_ptrl  =  loops_join_temp6()) 

•C 

cout  «  form  ("employees_name:  '/,s\nM, 

left_tuple_ptrl  ->  get_name()); 
temp6  &  left_source_ref 1  =  *  lef t_tuple_ptrl ; 
iterate  (temp5  *  left_tuple_ptr2  =  left_source_ref i. 

temp5_rels .  sc.an()  ) 

{ 

cout  «  form  ("toys_toys_name:  '/,s\n",  left_tuple_ptr2  -> 

get_toys_name()) ; 

cout  <<  form  (Mproducts_products_name:  '/,s\n",  left_tuple_ptr2  -> 

get_products_name()) ; 
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temp5  ft  left_source_rsf2  =  *  left_tuple_ptr2; 
iterate  (temp4  *  left_tuple_ptr3  =  left_source_ref2. 

temp4_rels.scan()) 

{ 

cout  «  form  (Mmanufacturers_location:  '/,s\nM, 

left_tuple_ptr3  ->  get./location()) ; 

cout  <<  form  ("manufacturers .name:  7.s\n", 

left_tuple_ptr3  ->  get.nameO); 

cout  «  form  ("manufacturers.phone:  '/,d\n", 

loft_tuple„ptr3  ->  get„phone()) ; 

} 

> 

> 

} 
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Appendix  G.  Store-Values  Method 


The  following  code  implements  the  store-values  query  depicted  in  Figure  3.8  and 
described  in  Section  3.3.3.  In  the  Triton  system,  the  code  would  be  written  to  a  file  called 
query,  e  by  the  E  code  generator  codegen  to  implement  the  query.  The  comments  that 
appear  in  the  code  in  this  appendix  are  not  generated  by  codegen  but  have  been  put  in  by 
the  author  to  explain  the  operation  of  the  store-values  method. 


#include  <stream.h> 
#include  <stdio.h> 

#ifndef  EMPLOYEES.!! 
#include  " employees. hl! 
#endif 


extern  persistent  empRVA  employees; 


mainO 

{ 

/*  scan  the  employee  relation  */ 

iterate  (emp  *  left_tuple_ptri  =  employees .scan()) 

{ 

/*  condition  statement  */ 

if  (strcmp  (left_tuple_ptrl->get_name() , "David")  ==  0) 

{ 

/*  set  up  a  reference  pointer  to  descend  oi.e  level  of  */ 
/*  nesting  in  employees  */ 


emp  &  left_source_ref 1  =  *  left_tuple_ptri; 

/*  scan  the  children  relation  */ 

iterate  (child  *  left_tuple_ptr2  =  left_source_ref 1 . 

children.  scanO) 
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} 


> 


/*  condition  statement  */ 


if  (strcmp  (left_tuple_ptr2->get_name() , "Florence")  “  0) 

{ 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 
/*  nesting  in  children  */ 

child  &  left_source_ref2  =  *  left_tuple_ptr2; 

/*  insert  two  new  "toy"  tuples  into  toys  */ 


toy  *  toy.ref  =  in  (left_source_ref2.toys) 
new  toy  ("car",  "black"); 
toy  *  toy_ref  =  in  (left_source_ref2.toys) 
new  toy  ("truck",  "blue"); 


> 


> 

/*  insert  a  new  "project"  tuple  into  projects  */ 

project  *  project.ref  =  in  (left_source_refi. projects) 
new  project  ("AWANA",  384); 


} 
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Appendix  H.  Modify  Method 


The  following  code  implements  the  modify  query  depicted  in  Figure  3.9  and  described 
in  Section  3.3.4.  In  the  Triton  system,  the  code  would  be  written  to  a  file  called  query. e  by 
the  E  code  generator  codegen  to  implement  the  query.  The  comments  that  appear  in  the 
code  in  this  appendix  are  not  generated  by  codegen  but  have  been  put  in  by  the  author  to 
explain  the  operation  of  the  modify  method. 


#include  <stream.h> 

#include  <stdio.h> 

tifndef  EMPLOYEES.!! 

#include  "employees. h" 

#endif 

extern  persistent  empRVA  employees; 

mainO 

{ 

/*  scan  the  employee  relation  */ 

iterate  (emp  *  left_tuple_ptrl  =  employees. scanO) 

{ 

/*  condition  statement  */ 

if  (strcmp  (left_tuple_ptri->get_name() , "David")  ==  0) 

{ 

/*  *  ****  *  *  *  ****  *****  ********  j);***#*  ****  **  *  ******  ****  ****  ***  / 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 
/*  nesting  in  employees  */ 

/  ***************************************  it:*##**  ******  *****/ 

emp  &  left_source_ref 1  =  *  lef t_tuple_ptrl ; 

/ *  scan  the  children  relatxon  / 

iterate  (child  *  left_tuple_ptr2  =  left_source_ref 1. 
children.  scanO) 

■C 
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/*  condition  statement  */ 

if  (strcmp  (left_tuple_otr2~>get_name() ,"Flo")  ==  0) 

Z**************************************  ************/ 
/*  invoke  member  function  to  change  child’s  age  */ 

/*  *  **  **  5(C  *  *********  *  ****  #  ****  5*c  **  *****  ******  *********  / 

left_tuple_ptr2->change_age  (4) ; 

/*  *  *  *  *  *  *  *  %  %  *  *  *  *  s|e  *  *  *  *  *  *  *  *  3(C  **  s|c  *  3fc  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  $  *  $  / 

/*  set  up  a  reference  pointer  to  descend  one  */ 

/*  level  of  nesting  in  children  */ 

child  &  left_source_ref2  =  *  left_tuple_ptr2; 

/*  scan  the  toys  relation  */ 

iterate  (toy  *  left_tuple_ptr3  =  left_source_ref2. 

toys  .scanO) 

{ 

/*  invoke  member  function  to  change  */ 

/*  the  color  of  the  toys  */ 

J  «jc  )jc  ^  ^  )|C  3j(  5|C  )|C  )jc  )|c  5|C  3|C3|CfjC  )|C  )jc  5^C  3jc  5|(  3|(  3j(  5|C  3$C  J 

left_tuple_ptr3->change_color  ("blue") ; 

> 


/*  scan  the  projects  relation  */ 

iterate  (project  *  left_tuple_ptr4  =  left_source_ref i. 

pro  j  ects .  scanO  ) 

{ 

/*  condition  statement  */ 
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if  (strcrap  (left_tuple_ptr4->get_name() ,"BNAD")  ==  0) 

{ 

sic*#*  ****!(<  it***  *********  ♦’tot***  **********/ 

/*  invoke  member  function  to  change  */ 

/*  the  name  of  the  project  */ 

/*  *  *  *  *  *  *  *  **  *  *  *  *  *  *  *  *  **  *  *  *  *  *  **  *  *  ****  *  *  *  *  / 
left_tuple_ptr4->change_name  ("TROY") ; 

> 

> 

> 

} 

> 
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Appendix  I.  Delete  Method 


The  following  code  implements  the  delete  query  depicted  in  Figure  3.10  and  described 
in  Section  3.3.5.  In  the  Triton  system,  this  code  is  written  to  a  lile  called  query. e  by  the  E 
code  generator  codegen  to  implement  the  query.  The  comments  that  appear  in  the  code  in 
this  appendix  are  not  generated  by  codegen  but  have  been  put  in  by  the  author  to  explain 
the  operation  of  the  delete  method. 


#include  <stream.h> 
#include  <stdio.h> 

#ifndef  EMPLOYEES.!! 
#include  "employees. h" 
#endif 


extern  persistent  empRVA  employees; 


mainO 

{ 

J  )j(  i)(  sfc  i|(  ijc  sfc  i|(  ijc  ijc  j|(  sfc  jjc  3jc  ijc  ijc  sfc  sfc  ifc  sfc  $  ijc  jjc  sfc  i|c  ijc  ■((  ijc  s)c  i|(  ijc  J 

/*  scan  the  employee  relation  */ 

iterate  (emp  *  left_tuple_ptri  =  employees. scanO) 

{ 

/*  condition  statement  */ 

if  (strcmp  (left_tuple_ptrl->get_name() , "David")  ==  0) 

{ 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 
/*  nesting  in  employees  */ 

emp  &  left_source_ref 1  =  *  lef t_tuple_ptrl ; 

/*  scan  the  children  relation  */ 

iterate  (child  *  left_tuple_p't,r2  =  left_source_refl. 

children .  scanO ) 


{ 
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/*  condition  statement  */ 

if  (strcmp  (left_tuple_ptr2->get_name() ,"Flo")  ==  0) 

{ 

/*  set  up  a  reference  pointer  to  descend  one  level  of  */ 

/*  nesting  in  children  */ 

child  &  left_source_ref2  =  *  left_tuple_ptr2; 

/*  scan  the  toys  relation  */ 

iterate  (toy  *  left_tuple_ptr3  =  left_source_ref2.toys.scan()) 

{ 

/*  delete  the  toy  tuple  */ 

delete  left_tuple_ptr3; 

> 

> 

} 

/*  scan  the  projects  relation  */ 

iterate  (project  *  left_tuple_ptr4  =  left_source_refl.projects.scan()) 

{ 

/*  delete  the  project  tuple  */ 

delete  left_tuple_ptr4; 

} 

} 

> 

> 
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Appendix  J.  SQL/NF  Create  Table  Definition  for  the  Relational  Version  of  the 

IDEFq  Language  Data 


The  SQL/NF  statements  to  create  the  schema  for  the  relational  version  of  the  IDEFo 
language  data  are  as  follows: 


CREATE  TABLE  act2act_table  ( 

parent_node  INT  4, 
child.node  INT  4) 

CREATE  TABLE  act2data_table  ( 
node_id  INT  4, 
data_id  INT  4, 
icom_type  CHAR  2) 

CREATE  TABLE  act2hist_table  ( 
node.id  INT  4, 
hist _ id  INT  4) 

CREATE  TABLE  act2ref_table  ( 
node_id  INT  4, 
ref.id  INT  4) 

CREATE  TABLE  activity_table  ( 
node.id  INT  4, 
node  CHAR  21, 
name  CHAR  26, 
project_id  INT  4, 
author_id  INT  2, 
version  CHAR  11, 
date  CHAR  9, 
x  INT  2, 
y  INT  2, 

visible_DRE  INT  1, 
sheet_id  INT  4) 

CREATE  TABLE  act_changes_table  ( 
node.id  INT  4, 
changes  CHAR  61) 

CREATE  TABLE  act_descr_table  ( 
node.id  INT  4, 
line_no  INT  2, 
desc_iine  CHAR  61) 

CREATE  TABLE  alias.table  ( 
data.id  INT  4, 
name  CHAR  26, 
where_used  CHAR  26, 
comment  CHAR  26) 


CREATE  TABLE  analyst_table  ( 
author_id  INT  2, 
author  CHAR  21) 

CREATE  TABLE  arrow.table  ( 

symbol_id  INT  4, 
arrow_type  INT  1) 

CREATE  TABLE  boundary_table  ( 
symbol_id  INT  4, 
icora.code  CHAR  3) 

CREATE  TABLE  data2data_table  ( 
parent_data  INT  4, 
child_data  INT  4) 

CREATE  TABLE  data21abel_table  ( 
data.id  INT  4, 
label.id  INT  4) 

CREATE  TABLE  data2ref_table  ( 
data_id  INT  4, 
ref.id  INT  4) 

CREATE  TABLE  data2value_table  ( 
data.id  INT  4, 
value.id  INT  4) 

CREATE  TABLE  data_changes_table  ( 
data_id  INT  4, 
changes  CHAR  61) 

CREATE  TABLE  data_descr_table  ( 
data_id  INT  4, 
line_no  INT  2, 
desc_line  CHAR  61) 

CREATE  TABLE  data_elem_table  ( 
data_id  INT  4, 
name  CHAR  26, 
project_id  INT  4, 
author_id  INT  2, 
version  CHAR  11, 
date  CHAR  9) 

CREATE  TABLE  data_range_table  ( 
data.id  INT  4, 
range_data  CHAR  61) 

CREATE  TABLE  data_type_table  ( 
data.id  INT  4, 
type  CHAR  26) 

CREATE  TABLE  data_value_tablel  ( 
value_id  INT  4, 


value  CHAR  16) 


CREATE  TABLE  dot_table  ( 

symbol_id  INT  4, 
dot.type  INT  1) 

CREATE  TABLE  footnote.table  ( 
graf_id  INT  4, 
x  INT  2, 
y  INT  2) 

CREATE  TABLE  feo_table  ( 

graf_id  INT  4, 
picture  CHAR  61) 

CREATE  TABLE  graphics_table  ( 
graf_id  INT  4, 
shcet_id  INT  4) 

CREATE  TABLE  hist_call_table  ( 
hist_id  INT  4, 
hist.proj  CHAR  21, 
hist_node  CHAR  13) 

CREATE  TABLE  label.table  ( 

label.id  INT  4, 
name  CHAR  11, 

X  INT  2, 
y  INT  2, 
sheet_id  INT  4) 

CREATE  TABLE  min_max_table  ( 
data_id  INT  4, 
minimum  CHAR  16, 
maximum  CHAR  16) 

CREATE  TABLE  note_table  ( 
graf_id  INT  4, 
label  INT  2, 
x  INT  2, 
y  INT  2) 

CREATE  TABLE  note_text_table  ( 
graf.id  INT  4, 
line.no  INT  2, 
text.line  CHAR  61) 

CREATE  TABLE  project.table  ( 
project.id  INT  4, 
name  CHAR  13) 

CREATE  TABLE  reference.table  ( 
ref.id  INT  4, 
line.no  INT  2, 
ref.line  CHAR  61) 


CREATE  TABLE  ref_type_table  ( 
ref_id  INT  4, 
type_ref  CHAR  26) 

CREATE  TABLE  segment_table  ( 
seg_id  INT  4, 
data_id  INT  4, 
sheet_id  INT  4, 
xs  INT  2, 
ys  INT  2, 
xe  INT  2, 
ye  INT  2) 

CREATE  TABLE  sheet_table  ( 
sheet_id  INT  4, 
c ..number  INT  4) 

CREATE  TABLE  squiggle_table  ( 
graf.id  INT  4, 
xl  INT  2, 
yl  INT  2, 
x2  INT  2, 
y2  INT  2, 
x3  INT  2, 
y3  INT  2, 
x4  INT  2, 
y4  INT  2) 

CREATE  TABLE  tunnel_table  ( 
symbol.id  INT  4, 
seg_id  INT  4, 
sheet.id  INT  4, 
x  INT  2, 
y  INT  2) 

CREATE  TABLE  to_from_all..table  ( 
symbol_id  INT  4, 
tfa.label  CHAR  2) 

CREATE  TABLE  symbol_table  ( 
symbol_id  INT  4, 
tunnel_type  INT  1) 

CREATE  TABLE  turn_table  ( 

symbol_id  INT  4, 
turn_type  INT  1, 
node_id  INT  4, 


Appendix  K.  SQL/NF  Create  Table  Definition  for  the  Nested  Version  of  the 

JDEFq  Language  Data 


The  SQL/NF  statement  to  create  the  schema  for  the  nested  version  of  the  IDEFo 
language  data  is  as  follows: 


CREATE  TABLE  projects  ( 

project.name  CHAR  13, 

TABLE  activities  ( 
node_id  INT  4, 
node  CHAR  21, 
name  CHAR  26, 
author  CHAR  21, 
version  CHAR  11, 
date  CHAR  9, 
changes  CHAR  61, 
c_number  INT  4, 
parent  CHAR  26, 

TABLE  act_descr  ( 

line.no  INT  2, 
desc.line  CHAR  61) , 

TABLE  references  ( 

ref .type  CHAR  26, 

TABLE  ref.lines  ( 

line.no  INT  2, 
line.ref  CHAR  61)), 
TABLE  hist.calls  ( 

hist.proj  CHAR  13, 
hist.node  CHAR  21) , 

TABLE  data.elems  ( 

data.name  CHAR  26, 
icom.type  CHAR  2) , 

TABLE  children  ( 

node.name  CHAR  26) , 

TABLE  data.elements  ( 
data.id  INT  4, 
name  CHAR  26, 
author  CHAR  21, 
version  CHAR  11, 
date  CHAR  9, 
changes  CHAR  61, 
parent  CHAR  26, 

TABLE  data.descr  ( 

Iine.no  INT  2, 
desc.line  CHAR  61) , 

TABLE  references  ( 

ref .type  CHAR  26, 

TABLE  ref.lines  ( 

line.no  INT  2, 
line.ref  CHAR  61)), 
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TABLE  aliases  ( 

name  CHAR  26, 

»here_used  CHAR  26, 
comment  CHAR  26) , 

TABLE  min.maxes  ( 

data_type  CHAR  26, 
minimum  CHAR  16, 
maximum  CHAR  16) , 

TABLE  ranges  ( 

data_type  CHAR  26, 
range_val  CHAR  61) , 

TABLE  values  ( 

data.type  CHAR  26, 
actual. value  CHAR  16), 

TABLE  activities  ( 

node.name  CHAR  26, 
icom.type  CHAR  2) , 

TABLE  children  ( 

data.name  CHAR  26) ) , 

TABLE  sheets  ( 

c.number  INT  4, 
node  CHAR  21, 
name  CHAR  26, 
author  CHAR  21, 
version  CHAR  11, 
date  CHAR  9, 

TABLE  boxes  ( 

node  CHAR  21, 
name  CHAR  26, 
x  INT  2 
y  INT  2, 

visible.dre  INT  2) , 

TABLE  segments  ( 

data.id  INT  4, 

TABLE  location  ( 
xs  INT  2, 
ys  INT  2, 
xe  INT  2, 
ye  INT  2) , 

TABLE  symbols  ( 
x  INT  2, 
y  INT  2, 

symbol.type  CHAR  26, 
type.symbol  CHAR  26)), 
TABLE  squiggles  ( 
xl  INT  2, 
yl  INT  2, 
x2  INT  2, 
y2  INT  2, 

X3  InT  2, 
y3  INT  2, 
x4  INT  2, 
y4  INT  2) , 

TABLE  meta.notes  ( 
label  CHAR  2. 
x  INT  2, 
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y  INT  2, 

TABLE  note.texts  ( 
line.no  INT  2, 
text.line  CHAR  61)), 
TABLE  foot_notes  ( 
label  INT  2, 
xu  INT  2, 
yu  INT  2, 
xn  INT  2, 
yn  INT  2, 

TABLE  note.texts  ( 
line.no  INT  2, 
text.line  CHAR  61)), 

TABLE  feos  ( 

label  CHAR  2, 
x  INT  2, 
y  INT  2, 

picture  CHAR  61) , 

TABLE  labels  ( 

data.id  INT  4, 
name  CHAR  11, 
x  INT  2, 
y  INT  2))) 
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